trinayan/m2s_4.2_fused
Folders and files
| Name | Name | Last commit date | ||
|---|---|---|---|---|
Repository files navigation
===============================================================================
Coding Guidelines
===============================================================================
* Indenting:
Tabs should be used for indenting. It is recommended to configure your
editor with a tab size of 8 spaces for a coherent layout with other
developers of Multi2Sim.
* Line wrapping:
Lines should have an approximate maximum length of 80 characters,
including initial tabs (and assuming a tab counts as 8 characters).
Code should continue on the next line with an additional tab
otherwise. Example:
int long_function_with_many_arguments(int argument_1, char *arg2,
int arg3)
{
/* Function body */
}
* Comments:
Comments should not use the double slash '//' notation. They should
use instead the '/* */' notation. Multiple-line comments should
use a '*' character at the beginning of each new line, respecting
the 80-character limit of the lines.
/* This is an example of a comment that spans multiple lines. When the
* second line starts, an asterisk is used. */
* Code blocks:
Brackets in code blocks should not share a line with other code, both
for opening and closing brackets.
The opening parenthesis should have no space on the left for a function
declaration, but it should have it for 'if', 'while', and 'for' blocks.
Examples:
void function(int arg1, char *arg2)
{
}
if (cond)
{
/* Block 1 */
}
else
{
/* Block 2 */
}
while (1)
{
}
In the case of conditionals and loops with only one line of code in
their body, no brackets need to be used. Examples:
for (i = 0; i < 10; i++)
only_one_line_example();
if (!x)
a = 1;
else
b = 1;
while (list_count(my_list))
list_remove_at(my_list, 0);
* Enumerations:
Enumeration types should be named 'enum <xxx>_t', without using
'typedef' declarations. For example:
enum my_enum_t
{
err_list_ok = 0,
err_list_bounds,
err_list_not_fount
};
* Memory allocation:
Dynamic memory allocation should be done using functions xmalloc,
xcalloc, xrealloc, and xstrdup, defined in 'lib/mhandle/mhandle.h'.
These functions internally call their corresponding malloc, calloc,
realloc, and strdup, and check that they return a valid pointer, i.e.,
enough memory was available to be allocated. For example:
void *buf;
char *s;
buf = xmalloc(100);
s = xstrdup("hello");
This mechanism replaces the previous approach based on an explicit check
by the caller that the returned pointer is valid.
* Forward declarations:
Forward declarations should be avoided. A source file ".c" in a library
should have two sections (if used) declaring private and public
functions. Private functions should be defined before they are used
to avoid forward declarations. Public functions should be included
in the ".h" file associated with the ".c" source (or common for the
entire library). For example:
/*
* Private Functions
*/
static void func1()
{
...
}
[ 2 line breaks ]
static void func2()
{
...
}
[ 4 line breaks ]
/*
* Public Functions
*/
void public_func1()
{
...
}
* Variable declaration
Variables should be declared only at the beginning of code blocks (can
be primary or secondary code blocks). Variables declared for a code
block should be classified in categories, such as type, or location of
the code where they will be used. Several variables sharing the same
type should be listed in different lines. For example:
static void mem_config_read_networks(struct config_t *config)
{
struct net_t *net;
int i;
char buf[MAX_STRING_SIZE];
char *section;
/* Create networks */
for (section = config_section_first(config); section;
section = config_section_next(config))
{
char *net_name;
/* Network section */
if (strncasecmp(section, "Network ", 8))
continue;
net_name = section + 8;
/* Create network */
net = net_create(net_name);
mem_debug("\t%s\n", net_name);
list_add(mem_system->net_list, net);
}
}
* Function declaration:
When a function takes no input argument, its declaration should use the
'(void)' notation, instead of just '()'. If the header uses the empty
parentheses notation, the compiler sets no restrictions in the number of
arguments that the user can pass to the function. On the contrary, the
'(void)' notation forces the caller to make a call with zero arguments
to the function.
This affects the function declaration in a header file, a forward
declaration in a C file, and the header of the function definition in
the C file.
Example:
void say_hello(void); /* Header file or forward declaration */
void say_hello(void) /* Function implementation */
{
printf("Hello\n");
}
* Spaces:
Conditions and expressions in parenthesis should be have a space on the
left of the opening parenthesis. Arguments in a function and function
calls should not have a space on the left of the opening parenthesis.
if (condition)
while (condition)
for (...)
void my_func_declaration(...);
my_func_call();
No spaces are used after an opening parenthesis or before a closing
parenthesis. One space used after commas.
if (a < b)
void my_func_decl(int a, int b);
Spaces should be used on both sides of operators, such as assignments or
arithmetic:
var1 = var2;
for (x = 0; x < 10; x++)
a += 2;
var1 = a < 3 ? 10 : 20;
result = (a + 3) / 5;
Type casts should be followed by a space:
printf("%lld\n", (long long) value);
* Integer types:
Integer variables should be declared using built-in integer types, i.e.,
avoiding types in 'stdint.h' (uint8_t, int8_t, uint16_t, ...). The
main motivation is that some non-built-in types require type casts in 'printf'
calls to avoid warnings. Since Multi2Sim is assumed to run either on an
x86 or an x86_64 machine, the following type sizes should be used:
Unsigned 8-, 16-, 32-, and 64-bit integers:
unsigned char
unsigned short
unsigned int
unsigned long long
Signed 8-, 16-, 32-, and 64-bit integers:
char
short
int
long long
Integer types 'long' and 'unsigned long' should be avoided, since they
compile to different sizes in 32- and 64-bit platforms.
===============================================================================
Header Files
===============================================================================
* Code organization:
After SVN Rev. 1087, most of the files in Multi2Sim are organized as
pairs of header ('.h') and source ('.c') files, with the main exception
being the Multi2Sim program entry 'src/m2s.c'.
Every public symbol (and preferably also private symbols) used in a pair
of header and source files should use a common prefix, unique for these
files. For example, prefix 'x86_inst_' is used in files
'src/arch/x86/emu/inst.h' and 'src/arch/x86/emu/inst.c'.
* Structure of a source ('.c') file:
In each source file, the copyright notice should be followed by a set of
'#include' compiler directives. The source file should include only
those files containing symbols being used throughout the source, with
the objective of keeping dependence tree sizes down to a minimum. Please
avoid copying sets of '#include' directives upon creation of a new file,
and add instead one by one as their presence is required.
Compiler '#include' directives should be classified in three groups,
separated with one blank line:
1) Standard include files found in the default include directory
for the compiler (assert.h, stdio.h, etc), using the notation
based on '< >' brackets. The list of header files should be
ordered alphabetically.
2) Include files located in other directories of the Multi2Sim
tree. These files are expressed as a relative path to the
'src' directory (this is the only additional default path
assumed by the compiler), also using the '< >' bracket
notation. Files should be ordered alphabetically.
3) Include files located in the same directory as the source
file, using the double quote '" "' notation. Files should be
ordered alphabetically.
Each source file should include all header files defining the symbols
used in its code, and not rely on a multiple inclusion (i.e., a header
file including another header file) for this purpose. In general, header
files will not include any other header file.
For example:
/*
* Multi2Sim
* Copyright (C) 2012
* [...]
*/
--- One blank line ---
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <lib/mhandle/mhandle.h>
#include <lib/util/debug.h>
#include <lib/util/list.h>
#include "emu.h"
#include "inst.h"
--- Two blank lines ---
[ Body of the source file ]
--- One blank line at the end of the file ---
* Structure of a header ('.h') file:
The contents of a header file should always be guarded by an '#ifdef'
compiler directive that avoid multiple or recursive inclusions of it.
The symbol following '#ifdef' should be the equal to the path of the
header file relative to the simulator 'src' directory, replacing slashes
and dashes with underscores, and using capital letters.
For example:
/*
* Multi2Sim
* Copyright (C) 2012
* [...]
*/
--- One blank line ---
#ifndef ARCH_X86_EMU_CONTEXT_H
#define ARCH_X86_EMU_CONTEXT_H
--- Two blank lines ---
[ '#include' lines go here, with the same format and spacing as for
source files. But in most cases, there should be no '#include' here at
all! ]
[ Body of the header file, with structure, 'extern' variables, and
function declarations ]
--- Two blank lines ---
#endif
* Avoiding multiple inclusions
The objective of avoiding multiple inclusions is keeping dependence
trees small, and thus speeding up compilation upon modification of
header files. A multiple inclusion occurs when a header file includes
another header file. In most cases, this situation can avoided with the
following techniques:
- If a structure defined in a header file contains a field defined as a
pointer to a structure defined in a different header file, the
compiler does not need the latter structure to be defined. For
example, structure
struct x86_context_t
{
struct x86_inst_t *current_inst;
}
defined in file 'context.h' has a field of type 'struct x86_inst_t *'
defined in 'inst.h'. However, 'inst.h' does not need to be included,
since the compiler does not need to now the size of the substructure
to reserve memory for a pointer to it.
- If a function defined in a header file contains an argument whose type
is a pointer to a structure defined in a different header file, the
latter need not be included either. Instead, a forward declaration of
the structure suffices to get rid of the compiler warning. For
example:
struct dram_device_t;
void dram_bank_create(struct dram_device_t *device);
In the example above, function 'dram_bank_create' would be defined in
a file named 'dram-bank.h', while structure 'struct dram_device_t'
would be defined in 'dram-device.h'. However, the latter file does not
need to be included as long as 'struct dram_device_t' is present as a
forward declaration of the structure.
Exceptions:
- When a derived type (type declared with 'typedef') declared in a
secondary header file is used as a member of a structure or argument
of a function declared in a primary header file, the secondary header
file should be included as an exception. This problem occurs often
when type 'FILE *' is used in arguments of 'xxx_dump' functions. In
this case, header file 'stdio.h' needs to be included.
- When a primary file defines a structure using a field of type 'struct'
(not a pointer) defined in a secondary header file, the secondary
header file needs to be included. In this case, the compiler needs to
know the exact size of the secondary structure to allocate space for
the primary.
For example, structure 'struct evg_bin_format_t' has a field of type
'struct elf_buffer_t'. Thus, header file 'bin-format.h' needs to
include 'elf-format.h'.
===============================================================================
Dynamic Structures (Objects)
===============================================================================
Although Multi2Sim is written in C, its structure is based in a set of
dynamically allocated objects (a processor core, a hardware thread, a
micro-instruction, etc.), emulating the behavior of an object oriented
language.
An object is defined with a structure declaration (struct), one or more
constructor and destructor functions, and a set of other functions
updating or querying the state of the object.
The structure and function headers should be declared in the '.h' file
associated with the object, while the implementation of public functions
(plus other private functions, structures, and variables) should appear
in its associated '.c' file.
In general, every object should have its own pair of '.c' and '.h' files
implementing and encapsulating its behavior.
As an example, let us consider the object representing an x86
micro-instruction, called 'x86_uop_t' and defined in
src/arch/x86/timing/uop.h
The implementation of the object is given in an independent the
associated C file in
src/arch/x86/timing/uop.c
The constructor and destructor of all objects follow the same template.
An object constructor returns a pointer to a new allocated object, and
takes zero or more arguments, used for the object initialization. The
code in the constructor contains two sections: initialization and
return, as shown in this example:
struct my_struct_t *my_struct_create(int field1, int field2)
{
struct my_struct_t *my_struct;
/* Initialize */
my_struct = xcalloc(1, sizeof(struct my_struct_t));
my_struct->field1 = field1;
my_struct->field2 = field2;
my_struct->field3 = 100;
/* Return */
return my_struct;
}
An object destructor takes a pointer to the object as the only argument,
and returns no value:
void my_struct_free(struct my_struct_t *my_struct)
{
[ ... free fields ... ]
free(my_struct);
}
===============================================================================
Multi2Sim Runtime Libraries
===============================================================================
In some cases, Multi2Sim requires a benchmark to be linked with specific runtime
libraries for correct simulation. Currently, this is the case of GPU simulation,
requiring OpenCL, CUDA, or OpenGL runtimes. Runtime libraries are linked
statically or dynamically with the application running on Multi2Sim (guest
code), and are not part of the source code compiled with the main 'configure'
and 'make' commands. Runtime libraries are found in directories
/tools/libm2s-XXX
As guest code, they are compiled targeting a 32-bit x86 architecture. When
running 'make' on a runtime library directory, two versions of it are generated,
one for dynamic and another for static linking:
/tools/libm2s-XXX/libm2s-XXX.a
/tools/libm2s-XXX/libm2s-XXX.so
A runtime library communicates with Multi2Sim through a specific system call
code, associated uniquely with that library, and not part of the standard Linux
system call codes. For example, the CUDA runtime library is associated with code
328.
The first argument of the system call is a function code, and the rest of the
arguments depend on the prototype defined for that function. The set of possible
function codes and their arguments define the Multi2Sim runtime library
interface, and both the runtime library (guest code) and the simulator (host
code) need to agree on it.
* Steps involved in a runtime function call
Let us assume a CUDA application statically linked with the Multi2Sim
CUDA runtime library, and running on Multi2Sim. When the application
performs a CUDA call, e.g. cuMalloc, the guest control flow jumps to the
implementation of the CUDA runtime library (still guest code).
Eventually, the runtime library might require a service from the
simulator, such as returning the pointer to the next available memory
region in GPU simulated memory. For this purpose, a system call with
code 328 will be performed, which is captured by the simulator in the
code implemented in file
src/arch/x86/emu/syscall.c
This file contains the implementation of the most common Linux system
calls. However, the implementation of runtime libraries interface should
be implemented in a separate file. In the case of the CUDA runtime
interface, the system call just calls function 'frm_cuda_call()',
implemented in
src/arch/fermi/emu/cuda.c
[ It is still to be defined what is the standard directory for
the host implementation of the runtime library interface, i.e.,
whether its part of the target GPU architecture, or the host x86
architecture. ]
* Runtime library calls
The set of calls that define a runtime library interface should be
declared in a *.dat file in the Multi2Sim host code. For example, the
set of functions for the new OpenCL runtime library is defined in
src/arch/x86/emu/clrt.dat
This file contains a list of calls (macro uses), on for each library
function. The arguments are the function name, followed by its
associated function code. The name of the first function should be
'init', devoted for version compatibility control and native/simulated
execution control (see below).
There are several positions in the code where a list of all library
functions is required. For example, an enumeration type
'enum x86_clrt_call_t' is defined, containing as many enumeration
constants as library function calls. This can be done by defining macro
'X86_CLRT_DEFINE_CALL' accordingly, and then including the 'clrt.dat'
file. This technique allows for the list of library calls to be
centralized in a single file, avoiding to update several code locations
every time a new call is added to the interface.
The implementation of each library function call should be part of the
simulator code, in the main *.c file associated with the runtime library
interface implementation. In the case of the OpenCL runtime, the
functions are found at the end of file
src/arch/x86/emu/clrt.c
The implementation of each function should be preceding by a multi-line
comment clearly specifying the prototype for that function (how many
arguments it has, the type of arguments, the definition of structures
pointed to by function arguments, etc.).
* Version control for a Multi2Sim runtime library interface
The first runtime function call should be called 'init', and should be
used both to initialize the host environment, and to carry out a version
compatibility check between the guest runtime library and the Multi2Sim
implementation of the runtime interface.
During the development of the Multi2sim runtime library, its interface
with the simulator evolves and changes, causing an older statically
pre-compiled application to be incompatible with the newer version of
the simulator, or vice versa. If this fact is silently ignored,
unexpected behaviors would be observed, and the cause of the errors
would be hard for the user to figure out.
Both the runtime library and the Multi2Sim implementation have a version
identifier formed of two numbers (major and minor version). For example,
the newer OpenCL runtime library defines
X86_CLRT_VERSION_MAJOR
X86_CLRT_VERSION_MINOR
in the Multi2Sim implementation at
/src/arch/x86/emu/clrt.c
while it defines
M2S_CLRT_VERSION_MAJOR
M2S_CLRT_VERSION_MINOR
in the runtime library implementation at
/tools/libm2s-clrt/m2s-clrt.h
When the interface between the runtime library and the Multi2sim
implementation is modified or extended, the version numbers should be
updated with the following criterion:
1) If the guest library requires a new feature from the host
implementation, the feature is added to the host, and the minor
version is updated to the current Multi2Sim SVN revision both in
host and guest. All previous services provided by the host
should remain available and backward-compatible. Executing a
newer library on the older simulator will fail, but an older
library on the newer simulator will succeed.
2) If a new feature is added that affects older services of the
host implementation breaking backward compatibility, the major
version is increased by 1 in the host and guest code. Executing
a library with a different (lower or higher) major version than
the host implementation will fail.
The 'init' function call in the runtime library will pass a pointer to a
structure of type
struct m2s_runtime_version_t
{
int major;
int minor;
}
containing the runtime library version. The host implementation reads
these values and compares them with the version of the runtime library
interface, according to the rules above. The 'init' function should
always return 0, or cause the program to exit.
* Native vs. simulated execution of the Multi2Sim runtime library
An application statically linked with a Multi2Sim runtime library is
aimed at running on Multi2Sim. However, a user could attempt to run it
natively as well. It is up to the Multi2Sim developer whether support
for native execution should be given as well in the runtime library. For
example, an OpenCL runtime library could use only its x86 OpenCL runtime
capabilities when run natively, and exploit Multi2Sim's support for GPU
simulation when run on the simulator.
It is easy for the Multi2Sim runtime library to find out whether it is
running natively or on the simulator, by checking the error code of the
runtime function call devoted for version control. On Multi2Sim, this
function call should always return 0, while on a native execution, the
system call will return -1, since it is not part of the Linux system
call interface.
If the runtime library detects native execution, and it is not a feature
supported for the library, it should output a clear error message, and
finalize the program. For example:
error: OpenCL program cannot be run natively.
This is an error message provided by the Multi2Sim OpenCL
library (libm2s-opencl). Apparently, you are attempting to run
natively a program that was linked with this library. You should
either run it on top of Multi2Sim, or link it with the OpenCL
library provided in the ATI Stream SDK if you want to use your
physical GPU device.
* Debug information for the host implementation of the runtime library interface
For every runtime library interface, such as the implementation of the
new OpenCL runtime interface present in
src/arch/x86/clrt.c
there should be a command-line option for the simulator executable (m2s)
that provides a trace of all calls performed by the runtime. In this
case, the option should be '--debug-x86-clrt <file>'. The steps to
include this option are:
1) Define 'x86_clrt_debug_category' in
src/arch/x86/clrt.c
and make it a public variable by including its external
definition in the suitable section of
src/arch/x86/x86-emu.h
2) Define 'x86_clrt_debug' macro in the corresponding section of
src/arch/x86/x86-emu.h
3) Add all code needed for a new debug category in the main
Multi2Sim program, add the new command-line option, and update
the help message at
src/m2s.c
* Debug information for the Multi2Sim runtime library
The runtime library itself is guest code, that can potentially run both
on Multi2Sim or natively. Thus, it is not possible to use a simulator
command-line option to activate debug information in a runtime library.
Instead, an environment variable is used for this purpose.
For example, the new OpenCL runtime library uses environment variable
'M2S_CLRT_DEBUG' to activate debug information. If the variable is set
to 1, the runtime library should dump information about every library
function called by the application, in the same format as the system
calls are dumped in
src/arch/x86/emu/syscall.c
As an example, the OpenCL runtime library uses function 'm2s_clrt_debug'
for debugging purposes, implemented in
tools/libm2s-clrt/m2s-clrt.c
===============================================================================
Steps to Publish a New Version Release (Administrator):
===============================================================================
* Try to generate distribution version on both 32-bit and 64-bit
machines. Compilation might cause different warnings. Compile in debug
mode, and also generate distribution packages.
./configure --enable-debug
make
./configure
make distcheck
* svn commit all previous changes
* Run 'svn update' + 'svn2cl.sh'
* Update AM_INIT_AUTOMAKE in 'configure.ac'
* Remake all to update 'Makefiles'
~/multi2sim/trunk$ make clean
~/multi2sim/trunk$ make
* Add line in Changelog:
"Version X.Y.Z released"
* Copy 'trunk' directory into 'tags'. For example:
~/multi2sim$ svn cp trunk tags/multi2sim-X.Y.Z
* svn commit
* In trunk directory, create tar ball.
~/multi2sim/trunk$ make distcheck
* Check that all additional distribution files were included in the
distribution package. Check that:
* 'libm2s-glut' compiles correctly.
* 'libm2s-opengl' compiles correctly.
* 'libm2s-opencl' compiles correctly.
* 'libm2s-cuda' compiles correctly.
* Copy tar ball to Multi2Sim server:
scp multi2sim-X.Y.Z.tar.gz $(M2S-SERVER):public_html/files/
* Update Multi2Sim web site.
* Log in.
* Click toolbox -> Special Pages -> Uncategorized templates
* Update 'Latest Version' and 'Latest Version Date'
* Send email to [email protected]
===============================================================================
Command to update Copyright in all files:
===============================================================================
In the Multi2Sim trunk directory, run:
$ sources=`find -regex '.*/.*\.\(c\|cpp\|h\|dat\)$'`
$ sed -i "s,^ \* Copyright.*$, * Copyright (C) 2011 Rafael Ubal ([email protected])," $sources
===============================================================================
Command to count lines of code
===============================================================================
In the Multi2Sim trunk directory, run:
$ wc -l `find -regex '.*/.*\.\(c\|cpp\|h\|dat\)$'`
===============================================================================
Notes about the memory system
===============================================================================
When a memory address is accessed in an SRAM bank, the access time is calculated based on the following time components:
* T_precharge: time to store the contents of the row buffer into its corresponding memory location.
* T_row_buf_access: base access time to the row buffer.
* T_activate: time to load a memory location into the row buffer.
There are three possibilities to compute the access time T_access, one for each of the following scenarios:
1) Row-closed. The row buffer is empty, and the new row needs to be activated before being accessed. T_access = T_activate + T_row_buf_access.
2) Row-hit. The row buffer contains the requested address. T_access = T_row_buf_access.
3) Row-conflict. The row buffer contains a different address. T_access = T_precharge + T_activate + T_row_buf_access.
===============================================================================
Help message for memory system configuration
===============================================================================
Option '--mem-config <file>' is used to configure the memory system. The
configuration file is a plain-text file in the IniFile format. The memory system
is formed of a set of cache modules, main memory modules, and interconnects.
Interconnects can be defined in two different configuration files. The first way
is using option '--net-config <file>' (use option '--help-net-config' for more
information). Any network defined in the network configuration file can be
referenced from the memory configuration file. These networks will be referred
hereafter as external networks.
The second option to define a network straight in the memory system
configuration. This alternative is provided for convenience and brevity. By
using sections [Network <name>], networks with a default topology are created,
which include a single switch, and one bidirectional link from the switch to
every end node present in the network.
The following sections and variables can be used in the memory system
configuration file:
Section [General] defines global parameters affecting the entire memory system.
PageSize = <size> (Default = 4096)
Memory page size. Virtual addresses are translated into new physical
addresses in ascending order at the granularity of the page size.
Section [Module <name>] defines a generic memory module. This section is used to
declare both caches and main memory modules accessible from CPU cores or GPU
compute units.
Type = {Cache|MainMemory} (Required)
Type of the memory module. From the simulation point of view, the
difference between a cache and a main memory module is that the former
contains only a subset of the data located at the memory locations it
serves.
Geometry = <geo>
Cache geometry, defined in a separate section of type [Geometry <geo>].
This variable is required for cache modules.
LowNetwork = <net>
Network connecting the module with other lower-level modules, i.e.,
modules closer to main memory. This variable is mandatory for caches, and
should not appear for main memory modules. Value <net> can refer to an
internal network defined in a [Network <net>] section, or to an external
network defined in the network configuration file.
LowNetworkNode = <node>
If 'LowNetwork' points to an external network, node in the network that
the module is mapped to. For internal networks, this variable should be
omitted.
HighNetwork = <net>
Network connecting the module with other higher-level modules, i.e.,
modules closer to CPU cores or GPU compute units. For highest level
modules accessible by CPU/GPU, this variable should be omitted.
HighNetworkNode = <node>
If 'HighNetwork' points to an external network, node that the module is
mapped to.
LowModules = <mod1> [<mod2> ...]
List of lower-level modules. For a cache module, this variable is rquired.
If there is only one lower-level module, it serves the entire address
space for the current module. If there are several lower-level modules,
each served a disjoint subset of the address space. This variable should
be omitted for main memory modules.
BlockSize = <size>
Block size in bytes. This variable is required for a main memory module.
It should be omitted for a cache module (in this case, the block size is
specified in the corresponding cache geometry section).
Latency = <cycles>
Memory access latency. This variable is required for a main memory module,
and should be omitted for a cache module (the access latency is specified
in the corresponding cache geometry section in this case).
AddressRange = { BOUNDS <low> <high> | ADDR DIV <div> MOD <mod> EQ <eq> }
Physical address range served by the module. If not specified, the entire
address space is served by the module. There are two possible formats for
the value of 'Range':
With the first format, the user can specify the lowest and highest byte
included in the address range. The value in <low> must be a multiple of
the module block size, and the value in <high> must be a multiple of the
block size minus 1.
With the second format, the address space can be split between different
modules in an interleaved manner. If dividing an address by <div> and
modulo <mod> makes it equal to <eq>, it is served by this module. The
value of <div> must be a multiple of the block size. When a module serves
only a subset of the address space, the user must make sure that the rest
of the modules at the same level serve the remaining address space.
Section [CacheGeometry <geo>] defines a geometry for a cache. Caches using this
geometry are instantiated [Module <name>] sections.
Sets = <num_sets> (Required)
Number of sets in the cache.
Assoc = <num_ways> (Required)
Cache associativity. The total number of blocks contained in the cache
is given by the product Sets * Assoc.
BlockSize = <size> (Required)
Size of a cache block in bytes. The total size of the cache is given by
the product Sets * Assoc * BlockSize.
Latency = <cycles> (Required)
Hit latency for a cache in number of cycles.
Policy = {LRU|FIFO|Random} (Default = LRU)
Block replacement policy.
MSHR = <size> (Default = 16)
Miss status holding register (MSHR) size in number of entries. This value
determines the maximum number of accesses that can be in flight for the
cache, including the time since the access request is received, until a
potential miss is resolved.
Ports = <num> (Default = 2)
Number of ports. The number of ports in a cache limits the number of
concurrent hits. If an access is a miss, it remains in the MSHR while it
is resolved, but releases the cache port.
Section [Network <net>] defines an internal default interconnect, formed of a
single switch connecting all modules pointing to the network. For every module
in the network, a bidirectional link is created automatically between the module
and the switch, together with the suitable input/output buffers in the switch
and the module.
DefaultInputBufferSize = <size>
Size of input buffers for end nodes (memory modules) and switch.
DefaultOutputBufferSize = <size>
Size of output buffers for end nodes and switch.
DefaultBandwidth = <bandwidth>
Bandwidth for links and switch crossbar in number of bytes per cycle.
Section [Entry <name>] creates an entry into the memory system. An entry is a
connection between a CPU core/thread or a GPU compute unit with a module in the
memory system.
Type = {CPU | GPU}
Type of processing node that this entry refers to.
Core = <core>
CPU core identifier. This is a value between 0 and the number of cores
minus 1, as defined in the CPU configuration file. This variable should be
omitted for GPU entries.
Thread = <thread>
CPU thread identifier. Value between 0 and the number of threads per core
minus 1. Omitted for GPU entries.
ComputeUnit = <id>
GPU compute unit identifier. Value between 0 and the number of compute
units minus 1, as defined in the GPU configuration file. This variable
should be omitted for CPU entries.
DataModule = <mod>
Module in the memory system that will serve as an entry to a CPU
core/thread when reading/writing program data. The value in <mod>
corresponds to a module defined in a section [Module <mod>]. Omitted for
GPU entries.
InstModule = <mod>
Module serving as an entry to a CPU core/thread when fetching program
instructions. Omitted for GPU entries.
Module = <mod>
Module serving as an entry to a GPU compute unit when reading/writing
program data in the global memory scope. Omitted for CPU entries.
===============================================================================
Help message in IPC report file
===============================================================================
The IPC (instructions-per-cycle) report file shows performance value for a
context at specific intervals. If a context spawns child contexts, only IPC
statistics for the parent context are shown. The following fields are shown in
each record:
<cycle>
Current simulation cycle. The increment between this value and the value
shown in the next record is the interval specified in the context
configuration file.
<inst>
Number of non-speculative instructions executed in the current interval.
<ipc-glob>
Global IPC observed so far. This value is equal to the number of executed
non-speculative instructions divided by the current cycle.
<ipc-int>
IPC observed in the current interval. This value is equal to the number
of instructions executed in the current interval divided by the number of
cycles of the interval.