
Cache Memory Organisation

Locality of Reference

When a program is executed, the instructions and data that the CPU references tend to be localised in a small portion of memory. For example, if a program calls a function repeatedly, the CPU repeatedly accesses the same memory locations, where that function is stored. This property of programs is called locality of reference.
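
To make the idea concrete, here is a minimal C sketch (the array size and the square function are invented purely for illustration). The loop bodies fetch the same instructions on every iteration, and the array is traversed sequentially, so the CPU keeps referencing the same small set of memory locations:

    #include <stdio.h>

    #define N 1024

    /* Called repeatedly: its instructions stay resident in the cache
       (temporal locality). */
    static long square(long x) { return x * x; }

    int main(void) {
        static long a[N];
        long sum = 0;

        /* Sequential traversal: neighbouring elements share cache blocks,
           so most accesses are fast (spatial locality). */
        for (int i = 0; i < N; i++)
            a[i] = i;
        for (int i = 0; i < N; i++)
            sum += square(a[i]);

        printf("sum = %ld\n", sum);
        return 0;
    }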

Now, if we move these frequently accessed memory locations into a high speed memory, we can improve instruction execution time to a large extent. This high speed memory is called cache memory, and it compensates for the speed difference between the CPU and main memory. By keeping frequently accessed locations in a fast cache, we decrease the CPU's average memory access time, which in turn decreases its instruction execution time.

Cache memory is placed between the CPU and RAM. The cache is the fastest component in the memory hierarchy and approaches the speed of the CPU itself.

The fundamental idea behind cache organisation is to keep the most frequently accessed instructions and data in the cache, so that the average memory access time approaches the access time of the cache. Although the capacity of a cache is very small compared to random access main memory, it improves the average instruction execution time considerably.

Basic Operation of Cache Memory

When the CPU makes a reference to a memory location, the cache is searched first. If the reference is found in the cache, the word is accessed from there. If the location is not found in the cache, which is a cache miss, main memory is accessed instead, and the block of words containing the referenced word is brought into the cache. Transferring these extra words means that subsequent references are likely to be found in the fast cache memory.
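
This behaviour can be modelled with a small direct-mapped cache, sketched below in C. The geometry (64 lines of 16-byte blocks) is invented for illustration: each address is split into a tag and a line index, the indexed line is checked, and on a miss the whole block is brought in from main memory.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical geometry: 64 lines of 16-byte blocks, i.e. a 1 KB cache. */
    #define NUM_LINES  64
    #define BLOCK_SIZE 16

    struct line { bool valid; uint32_t tag; };
    static struct line cache[NUM_LINES];
    static unsigned hits, misses;

    /* Look up one byte address; on a miss, load its whole block. */
    static void cache_access(uint32_t addr) {
        uint32_t block = addr / BLOCK_SIZE;   /* which memory block          */
        uint32_t index = block % NUM_LINES;   /* which cache line it maps to */
        uint32_t tag   = block / NUM_LINES;   /* identifies the block        */

        if (cache[index].valid && cache[index].tag == tag) {
            hits++;                           /* word served from the cache  */
        } else {
            misses++;                         /* go to main memory and       */
            cache[index].valid = true;        /* bring the block into cache  */
            cache[index].tag   = tag;
        }
    }

    int main(void) {
        for (uint32_t a = 0; a < 256; a++)    /* sequential references       */
            cache_access(a);
        printf("hits=%u misses=%u hit ratio=%.3f\n",
               hits, misses, (double)hits / (hits + misses));
        return 0;
    }

Run on 256 sequential byte addresses, it reports one miss per 16-byte block and hits for the rest, a hit ratio of 0.9375: the extra words fetched on each miss pay off on the following references.
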
Hit ratio: The performance of cache memory is frequently measured by a quantity called the hit ratio. When the CPU refers to memory and finds the word in the cache, it is said to be a hit. If the referenced word is not found in the cache and main memory must be accessed, it is called a miss. The hit ratio is the total number of hits divided by the total number of memory references. The higher the hit ratio, the higher the computer's performance.

If the hit ratio is high enough that the CPU accesses the cache most of the time, the average memory access time approaches the access time of the cache. The average is given by t_avg = h × t_cache + (1 − h) × t_main, where h is the hit ratio. Say main memory access time is 1000 ns, cache access time is 100 ns, and the hit ratio is 0.8: the average memory access time becomes 0.8 × 100 + 0.2 × 1000 = 280 ns, quite an improvement over the 1000 ns it would take with no cache at all.
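
This formula is easy to tabulate. The short C program below uses the 100 ns and 1000 ns figures from the example and sweeps the hit ratio:

    #include <stdio.h>

    int main(void) {
        const double t_cache = 100.0, t_main = 1000.0;   /* figures from the text */
        /* t_avg = h * t_cache + (1 - h) * t_main */
        for (double h = 0.5; h < 1.05; h += 0.1)
            printf("hit ratio %.1f -> average access time %5.0f ns\n",
                   h, h * t_cache + (1.0 - h) * t_main);
        return 0;
    }

At a hit ratio of 0.8 it prints 280 ns, matching the calculation above; as h approaches 1, the average approaches the 100 ns cache access time.
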
Two properties are therefore fundamental to a cache: fast access time, and a hit ratio that approaches 1. To achieve fast access time, caches are built from static RAM instead of dynamic RAM. To increase the hit ratio, several levels of cache memory are maintained; generally three levels are built, so that more and more data is kept in cache, main memory is accessed less frequently, and the hit ratio rises.
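
The average access time model extends naturally to several levels: each level is consulted only when the faster ones miss. The access times and per-level hit ratios in this sketch are assumed purely for illustration:

    #include <stdio.h>

    int main(void) {
        /* Assumed figures, for illustration only. */
        double t1 = 1, t2 = 4, t3 = 20, tm = 100;   /* access times in ns   */
        double h1 = 0.90, h2 = 0.95, h3 = 0.98;     /* per-level hit ratios */

        /* Each level handles only the references the previous levels missed. */
        double t_avg = h1 * t1
                     + (1 - h1) * (h2 * t2
                     + (1 - h2) * (h3 * t3
                     + (1 - h3) * tm));
        printf("average access time = %.2f ns\n", t_avg);   /* 1.39 ns */
        return 0;
    }

Even with modest per-level hit ratios, the average stays close to the L1 access time, which is why adding levels of cache pays off.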

Types of Cache

Based on its position relative to the CPU, cache memory can be classified into three types.

L1 cache: The L1 cache has the smallest capacity of all the cache levels and is the fastest memory in the memory hierarchy. Its capacity varies from 8 KB to 64 KB. In computers with multicore CPU chips, every core has its own local L1 cache, which is not shared with the other cores. The L1 cache holds data and instructions separately, meaning each core has two caches: one for data and one for instructions.

L2 cache: The L2 cache is larger in capacity and a bit slower than the L1 cache. It may be embedded in the CPU chip or sit outside the chip between the CPU and RAM, although the recent trend is to embed all cache levels in the CPU chip. An L2 cache may be local to one core or shared among several cores, depending on the total number of cores in the chip. In a typical dual core design, each core has its own L2 cache that is not shared with the other core, while in some quad core designs an L2 cache is shared by each pair of cores, giving two L2 caches for four cores. An L2 cache can be 256 KB or more in size, depending on the processor generation.

L3 cache: The L3 cache is the slowest of the three levels but has the largest capacity, generally measured in MB. An L3 cache is usually shared among all the CPU cores. In earlier computers the L3 cache was placed outside the CPU, but in recent designs it too is embedded in the CPU chip.
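
On a Linux machine these levels can be inspected directly: the kernel describes each cache of a CPU under /sys/devices/system/cpu/cpu0/cache/index*/ through attributes such as level, type, size and shared_cpu_list. The C sketch below (assuming such a system) prints them, silently skipping entries that do not exist:

    #include <stdio.h>

    /* Print one sysfs cache attribute of cpu0, if it exists. */
    static void show(const char *attr, int idx) {
        char path[128], buf[64];
        snprintf(path, sizeof path,
                 "/sys/devices/system/cpu/cpu0/cache/index%d/%s", idx, attr);
        FILE *f = fopen(path, "r");
        if (f) {
            if (fgets(buf, sizeof buf, f))
                printf("  %-16s %s", attr, buf);   /* buf keeps its newline */
            fclose(f);
        }
    }

    int main(void) {
        for (int idx = 0; idx < 4; idx++) {   /* index0..index3 on many CPUs */
            printf("index%d:\n", idx);
            show("level", idx);
            show("type", idx);
            show("size", idx);
            show("shared_cpu_list", idx);
        }
        return 0;
    }

On many desktop processors this lists a private L1 data cache and L1 instruction cache, a per-core or shared L2, and an L3 whose shared_cpu_list covers all the cores.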

In some processors that implement multi-level caches, all data present in the L1 cache must also be present in the L2 cache; such caches are called strictly inclusive. In other processors the caches are exclusive: data may reside in the L1 cache or the L2 cache, but never in both.