What happens when we increase cache memory size?
How does it impact my system performance?
Does increasing cache size/RAM always improve my system performance?

The answers to some of these questions are not straightforward, because they depend on several factors such as the type of application being run and the processor architecture. Trying out different hardware combinations (in this case, cache memory configurations) on actual systems is expensive and often impractical. In such cases, we can use computer architectural simulators to analyze various aspects of computer systems.

A computer architectural simulator is a software tool used to model and mimic the behavior of a computer system’s architecture. It allows researchers, developers, and engineers to study and analyze various aspects of computer systems, such as processors, memory hierarchies, and interconnections, without needing physical hardware. Simulators provide a platform to design, experiment with, and evaluate computer systems in a virtual environment.
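To make the idea concrete, here is a toy sketch of what "modeling a cache in software" can look like. It assumes a fully associative cache with LRU replacement and a made-up address trace. It is only an illustration and not how gem5 models caches.

```python
from collections import OrderedDict


def lru_hit_rate(addresses, num_lines, line_size=64):
    """Return the hit rate of a fully associative LRU cache on an address trace."""
    cache = OrderedDict()  # keys are cache-line tags, ordered oldest to newest
    hits = 0
    for addr in addresses:
        tag = addr // line_size
        if tag in cache:
            hits += 1
            cache.move_to_end(tag)  # mark the line as most recently used
        else:
            if len(cache) >= num_lines:
                cache.popitem(last=False)  # evict the least recently used line
            cache[tag] = True
    return hits / len(addresses) if addresses else 0.0


# Hypothetical trace: loop 10 times over a 16 KiB array, one access per 64-byte line.
trace = [i * 64 for i in range(256)] * 10

for num_lines in (64, 128, 256, 512):
    size_kib = num_lines * 64 // 1024
    print(f"{size_kib:3d} KiB cache: hit rate = {lru_hit_rate(trace, num_lines):.2f}")
```

For this particular trace, doubling the cache from 4 KiB to 8 KiB changes nothing, because the 16 KiB working set still does not fit and LRU evicts every line before it is reused. Going from 16 KiB to 32 KiB does not help either, because everything already fits. A bigger cache only pays off when it crosses the application's working-set size.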

A computer architectural simulator helps users to:

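Based on their simulation behavior, simulators can be broadly classified into:

Functional simulators, which model what the hardware computes without modeling timing.
Example: Simulating the sequence of instructions executed by a CPU.

Timing (cycle-level) simulators, which also model how long operations take.
Example: Evaluating the latency and throughput of a pipeline.

Full-system simulators, which model a complete machine, including devices, so that system software can run on it.
Example: Testing how an operating system performs on new hardware.

Let us glance through some of the popular computer architectural simulators that are used for academic and industrial research purposes.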

In this blog we will discuss gem5 further as it stands out among these simulators due to several key advantages:

Introduction to gem5

gem5 is a state-of-the-art open-source computer architecture simulator widely used in academia and industry for modeling and evaluating computer systems. It provides a flexible and modular framework for simulating diverse architectures, from simple single-core systems to complex multi-core and heterogeneous setups, and is primarily used for research and development in computer architecture, system software, and hardware-software co-design. gem5 is written primarily in C++ and Python. It can simulate a system with devices and an operating system in full system mode (FS mode), or user-space-only programs, where system services are provided directly by the simulator, in syscall emulation mode (SE mode). gem5 supports executing Alpha, ARM, MIPS, Power, SPARC, RISC-V, and 64-bit x86 binaries on CPU models including two simple single-CPI models, an out-of-order model, and an in-order pipelined model. It can also run precompiled binaries for performance evaluation.

Memory models

gem5 provides two memory models for simulating memory systems: Classic and Ruby. The table below summarizes their key features.

| Feature | Classic Model | Ruby Model |
| --- | --- | --- |
| Cache Coherence Protocols | Predefined (MOESI, MESI) | Fully customizable |
| Ease of Use | Simple to configure and use | Complex, requires expertise |
| Simulation Speed | Faster | Slower |
| Flexibility | Limited | High |
| Custom Protocol Support | No | Yes |
| Use Case | General-purpose simulations | Advanced research and experiments |

Let’s discuss how to use gem5 and try some small exercises to get familiar with the tool.
First, we need to install gem5. The following steps walk through the installation.

Step 1: Install dependencies

sudo apt install build-essential git m4 scons zlib1g zlib1g-dev libprotobuf-dev protobuf-compiler libprotoc-dev libgoogle-perftools-dev python3-dev python3

Step 2: Clone gem5 repo

git clone https://github.com/gem5/gem5

Step 3: Build the system

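We can build gem5 for any supported ISA. Here I am using RISC-V as an example (the -j option sets the number of parallel build jobs; adjust it to the number of cores on your machine):

scons build/RISCV/gem5.opt -j9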

Experimenting with gem5

Now, let’s try out some experiments with the RISCV system we just built and analyze its performance.

Let’s start by measuring two metrics: the level 2 (L2) cache miss rate and IPC (Instructions Per Cycle). We will vary the L2 cache size and observe its impact on the L2 miss rate and IPC.
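For reference, both metrics are simple ratios of event counts. The sketch below spells them out. The function names are my own, and the numbers in the usage lines are placeholders chosen for easy arithmetic, not measurements.

```python
def ipc(instructions, cycles):
    """Instructions Per Cycle: committed instructions divided by elapsed CPU cycles."""
    return instructions / cycles


def miss_rate(misses, accesses):
    """Fraction of cache accesses that miss (0.0 if there were no accesses)."""
    return misses / accesses if accesses else 0.0


def hit_rate(misses, accesses):
    """Fraction of cache accesses that hit (0.0 if there were no accesses)."""
    return (accesses - misses) / accesses if accesses else 0.0


# Placeholder counts, for illustration only.
print(ipc(instructions=1_000, cycles=2_000))
print(miss_rate(misses=25, accesses=100))
print(hit_rate(misses=25, accesses=100))
```

A higher IPC means the core retires more work per cycle. Hit rate and miss rate always add up to 1, so either one can be reported.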

For this, we need to select an application and build its binary for the required ISA, in this case RISC-V.

I used canneal from the PARSEC benchmark suite. The binaries need to be built for RISC-V using the RISC-V toolchain.

You can find the source code and the steps to build RISC-V binaries for various applications, including canneal, at the link below:
https://github.com/RALC88/riscv-vectorized-benchmark-suite

After building the benchmark binaries, run them with different cache configurations. Here I am varying the L2 cache size. To run the canneal benchmark with a 512 KB L2 cache, use the command below:

./build/RISCV/gem5.opt configs/deprecated/example/se.py --cmd=/home/siva/gem5/canneal_serial.exe --options="1 15000 2000 input_can/200000.nets 64" --caches --l2cache --l2_size=512kB --cpu-type=RiscvO3CPU
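Repeating this command by hand for every cache size gets tedious, so a small driver script can help. The sketch below only builds and prints the command for each size by default. The list of L2 sizes and the output directory names are my own assumptions. It also uses gem5's --outdir option, placed before the config script, so that each run writes its stats to a separate directory instead of overwriting m5out.

```python
import shlex
import subprocess

GEM5 = "./build/RISCV/gem5.opt"
CONFIG = "configs/deprecated/example/se.py"
BINARY = "/home/siva/gem5/canneal_serial.exe"
OPTIONS = "1 15000 2000 input_can/200000.nets 64"
L2_SIZES = ["256kB", "512kB", "1MB", "2MB"]  # arbitrary sweep points (assumption)


def build_command(l2_size, outdir):
    """Return the gem5 argument list for one run with the given L2 size."""
    return [
        GEM5,
        f"--outdir={outdir}",  # gem5 option: must come before the config script
        CONFIG,
        f"--cmd={BINARY}",
        f"--options={OPTIONS}",
        "--caches",
        "--l2cache",
        f"--l2_size={l2_size}",
        "--cpu-type=RiscvO3CPU",
    ]


def sweep(dry_run=True):
    """Print the command for each L2 size; also run it when dry_run is False."""
    for size in L2_SIZES:
        cmd = build_command(size, outdir=f"m5out_l2_{size}")  # hypothetical dir name
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


sweep(dry_run=True)
```

Call sweep(dry_run=False) from the gem5 source directory to actually launch the simulations one after another.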

The available command-line arguments are defined in gem5/configs/common/Options.py.

Once the simulation ends, you can check the stats file (gem5/m5out/stats.txt) for the required parameters. I have given the values of the L2 hit rate and IPC for different L2 cache sizes as a reference. The results demonstrate the impact of cache size on IPC and hit rate for a given application.
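Reading the numbers out of stats.txt by hand works, but a short script makes it easier to compare runs. The sketch below relies on each stats line having the form "name value # description". The two stat names in the sample are assumptions, since the exact names differ between gem5 versions and configurations, so check your own stats.txt for the matching entries. The sample values are placeholders, not results.

```python
def parse_stats(text):
    """Parse gem5-style stats text into a {name: value} dictionary of floats."""
    stats = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop the trailing description
        if not line or line.startswith("-"):  # skip blanks and Begin/End markers
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            stats[fields[0]] = float(fields[1])
        except ValueError:
            continue  # ignore entries whose value is not numeric
    return stats


# Placeholder excerpt: stat names are assumed, values are made up for illustration.
SAMPLE = """
---------- Begin Simulation Statistics ----------
system.cpu.ipc                          0.5      # placeholder value
system.l2.overallMissRate::total        0.25     # placeholder value
---------- End Simulation Statistics   ----------
"""

stats = parse_stats(SAMPLE)
ipc_value = stats["system.cpu.ipc"]
l2_hit_rate = 1.0 - stats["system.l2.overallMissRate::total"]
print(f"IPC = {ipc_value}, L2 hit rate = {l2_hit_rate}")
```

To use it on a real run, pass the contents of the stats file instead of SAMPLE, for example parse_stats(open("m5out/stats.txt").read()).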
