Program 2 builds and tests a Cache class that models the two levels of cache needed for Program 3.
The cache is composed of a number of “lines” or “blocks”. A cache line includes a valid flag, a tag and LRU data. In the real world, it would also include a data block, but for this simulation, the data blocks don’t need to be modeled. To simplify code, we’ll leave it out.
For the ARM Cortex-A53 processor, there are two levels of cache. The Level 1 cache is 32 KB with 64-byte blocks and is organized as 4-way set-associative. The Level 2 cache is 1 MB with 64-byte blocks but is 16-way set-associative.
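The sizes above determine how many sets each cache has and how an address splits into offset, set index and tag. The following is a minimal sketch of that arithmetic; the class name CacheGeometry and its methods are hypothetical helpers, not part of the required interface, and the bit counts assume 32-bit addresses and power-of-two sizes.

```csharp
using System;

// Hypothetical helper class, used only to illustrate the arithmetic.
public static class CacheGeometry
{
    // Number of sets = total bytes / (bytes per block * blocks per set).
    public static int NumSets(int cacheSize, int blockSize, int associativity)
    {
        return cacheSize / (blockSize * associativity);
    }

    // log2 of a power of two, e.g. Log2(64) == 6.
    public static int Log2(int value)
    {
        int bits = 0;
        while (value > 1)
        {
            value >>= 1;
            bits++;
        }
        return bits;
    }
}
```

With these numbers, L1 has 32768 / (64 * 4) = 128 sets, so an address has 6 offset bits, 7 index bits and 19 tag bits. L2 has 1048576 / (64 * 16) = 1024 sets, giving 6 offset bits, 10 index bits and 16 tag bits.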
The Cache class should have a constructor which is defined as
public Cache(int CacheSize, int BlockSize, int Associativity)
The cache itself is an array of structs, each containing three fields: a valid variable, a tag variable and an LRU number. The actual data in the cache is not represented because it’s not needed for this program to run. The number of entries in the cache array is the cache size divided by the number of bytes per block.
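One possible layout for that array is sketched below. The names CacheLine, CacheStorage and LineIndex are assumptions made for illustration, as is the choice to store the blocks of a set next to each other in one flat array.

```csharp
using System;

// One cache line: no data block, only the bookkeeping fields.
public struct CacheLine
{
    public bool Valid;   // true once the line has been filled
    public uint Tag;     // upper address bits identifying the block
    public uint Lru;     // LRU number (its exact encoding is up to the implementation)
}

// Hypothetical container showing how the array could be sized and indexed.
public class CacheStorage
{
    public readonly int NumSets;
    public readonly int Associativity;
    public readonly CacheLine[] Lines;   // one entry per block, grouped by set

    public CacheStorage(int cacheSize, int blockSize, int associativity)
    {
        Associativity = associativity;
        int numBlocks = cacheSize / blockSize;   // total entries in the array
        NumSets = numBlocks / associativity;
        Lines = new CacheLine[numBlocks];        // every line starts with Valid == false
    }

    // Position of a given way of a given set within the flat array.
    public int LineIndex(int set, int way)
    {
        return set * Associativity + way;
    }
}
```

For L1 this gives 512 entries in 128 sets of 4, and for L2 it gives 16384 entries in 1024 sets of 16.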
There is a method which is used to obtain information from the cache. The function prototype is
public bool Get(uint uAddress, uint uCycle)
The method returns true if there was a tag match (the data was found in cache), or false if there was a miss. Since we’re only modeling hits and misses, not data transfers, the interaction between cache levels is much simpler. Your first access to L1 Get will be a compulsory miss. Assume that you’ll eventually get the data from L2, so before you return “false”, update the cache as if the data had arrived from L2: set the valid bit and write the tag and the LRU information. The method returns false (because there wasn’t a tag match), and you then call the L2 Get method. If it’s the program’s first call to L2, it too will miss and return false, but as with L1, you pretend you got the data from a lower-level store (main memory in this case) and set the appropriate valid flag, the LRU info and the tag.
Thus, if you go back into L1 with an address that falls in the same 64-byte block as the first address you sent to L1 (the same tag and set index), you’ll get a hit. The L1 Get method returns true, and you don’t need to ask L2 about the data.
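The hit and miss behavior described above could be sketched as follows. This is one possible approach rather than the required solution: it assumes the LRU number is simply the cycle of the most recent access (so the smallest value is the least recently used), that an invalid line is filled before any valid line is evicted, and that all sizes are powers of two. The private Line struct and field names are illustrative.

```csharp
using System;

public class Cache
{
    private struct Line
    {
        public bool Valid;
        public uint Tag;
        public uint Lru;   // assumed encoding: cycle of the last access
    }

    private readonly Line[] lines;
    private readonly int blockSize;
    private readonly int associativity;
    private readonly int numSets;

    public Cache(int CacheSize, int BlockSize, int Associativity)
    {
        blockSize = BlockSize;
        associativity = Associativity;
        numSets = CacheSize / (BlockSize * Associativity);
        lines = new Line[numSets * Associativity];
    }

    public bool Get(uint uAddress, uint uCycle)
    {
        uint block = uAddress / (uint)blockSize;   // drop the byte offset
        int set = (int)(block % (uint)numSets);    // set index
        uint tag = block / (uint)numSets;          // remaining upper bits
        int first = set * associativity;           // first line of this set

        int victim = first;
        for (int i = first; i < first + associativity; i++)
        {
            if (lines[i].Valid && lines[i].Tag == tag)
            {
                lines[i].Lru = uCycle;             // hit: refresh the LRU info
                return true;
            }
            if (!lines[i].Valid)
            {
                // Prefer the first invalid line over any valid one.
                if (lines[victim].Valid) victim = i;
            }
            else if (lines[victim].Valid && lines[i].Lru < lines[victim].Lru)
            {
                victim = i;                        // older valid line
            }
        }

        // Miss: fill the line as if the data had arrived from the next level.
        lines[victim].Valid = true;
        lines[victim].Tag = tag;
        lines[victim].Lru = uCycle;
        return false;
    }
}
```

A caller could chain the two levels with a short-circuit, for example `bool hit = l1.Get(addr, cycle) || l2.Get(addr, cycle);`, so that L2 is consulted only when L1 misses.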
You will also need a method to print the valid contents of cache. An example from the L1 cache is
All blocks in a set are printed on the same line, although only sets with valid entries are printed. A set whose entries are all invalid will not appear. The above image is for the L1 cache which has 4 cache blocks per set. For the L2 cache, the print output would be much wider because there are 16 blocks per set. Here is what the L2 cache print would look like, although the blocks are truncated because of the width of the image: