# PRiME: Power-efficient, Reliable, Many-core Embedded systems



**Engineering and Physical Sciences Research Council** 

# PRiME Demonstrator (2018) – Theme 2

### Introduction

- Heterogeneous multi-core systems run multiple applications concurrently, each with different performance requirements
- Runtime mapping and adaptation are required to meet these requirements under workload variability
- In such systems, it is challenging to exploit:
  - Various types of cores simultaneously
  - The dynamic voltage and frequency scaling (DVFS) potential of the cores

#### **Demonstrator:**

- Figure shows the proposed energy-efficient runtime management approach (ITMD), which:
  - Performs inter-cluster thread-to-core mapping (ITM) considering performance and resource constraints
  - Adapts to workload variations through voltage-frequency scaling based on the metric Memory Reads Per Instruction (MRPI)

#### **System Model:**

- Hardware platform (Odroid-XU3) specifications:
  - 4 ARM Cortex-A15 cores (A15 cluster) and 4 ARM Cortex-A7 cores (A7 cluster)
  - Shared L2 cache per cluster: 2 MB for big and 512 KB for LITTLE
  - On-chip power and temperature sensors
- Operating frequency range:
  - big: 0.2 GHz to 2 GHz in 100 MHz steps
  - LITTLE: 0.2 GHz to 1.4 GHz in 100 MHz steps
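
The operating frequency ranges above define a discrete set of DVFS levels per cluster. A minimal Python sketch of how those levels could be enumerated, assuming both range endpoints are valid operating points; the helper names `frequency_levels_mhz` and `snap_to_level` are hypothetical and do not come from the demonstrator:

```python
def frequency_levels_mhz(min_mhz: int, max_mhz: int, step_mhz: int = 100) -> list[int]:
    """Enumerate the discrete frequency levels of one cluster, in MHz."""
    # Assumption: both endpoints of the stated range are valid operating points.
    return list(range(min_mhz, max_mhz + 1, step_mhz))


# Ranges as stated for the Odroid-XU3 clusters.
BIG_LEVELS_MHZ = frequency_levels_mhz(200, 2000)     # A15 cluster
LITTLE_LEVELS_MHZ = frequency_levels_mhz(200, 1400)  # A7 cluster


def snap_to_level(target_mhz: float, levels_mhz: list[int]) -> int:
    """Return the valid level closest to target_mhz (ties go to the lower level)."""
    return min(levels_mhz, key=lambda f: (abs(f - target_mhz), f))


if __name__ == "__main__":
    print(len(BIG_LEVELS_MHZ), "big levels,", len(LITTLE_LEVELS_MHZ), "LITTLE levels")
    print(snap_to_level(1450, LITTLE_LEVELS_MHZ))  # clamps to the LITTLE maximum
```

Resolving ties toward the lower level is an illustrative choice that favours energy over performance; a real governor may round differently.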



#### **Applications:**

- Multi-threaded applications from PARSEC and SPLASH
- Parallelism can be set by a command-line argument
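
As an illustration of setting parallelism from the command line, the sketch below builds a PARSEC invocation through its `parsecmgmt` management script, whose `-n` option sets the minimum number of threads. The poster does not state how the benchmarks were launched, so the benchmark name, input set, and the helper `parsec_run_command` are assumptions; SPLASH programs take their own flags and are not shown.

```python
import shlex


def parsec_run_command(benchmark: str, threads: int, input_set: str = "simlarge") -> list[str]:
    """Build (but do not execute) a parsecmgmt command for one benchmark run."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return [
        "parsecmgmt",
        "-a", "run",         # action: run the benchmark
        "-p", benchmark,     # package (benchmark) name
        "-i", input_set,     # input set; "simlarge" is an assumed choice
        "-n", str(threads),  # minimum number of threads
    ]


if __name__ == "__main__":
    # Hypothetical example: blackscholes with 4 threads.
    print(shlex.join(parsec_run_command("blackscholes", 4)))
```

The list form can be passed directly to `subprocess.run` on a machine where PARSEC is installed.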

### Results

- Figure shows the variation in frequency and workload (MRPI) for the *ondemand* (HMPO) governor and the *proposed* approach
- A high MRPI leads to scaling down the frequency
- The proposed approach achieves up to 33% energy savings
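
The relationship between MRPI and frequency can be sketched as follows. This is a minimal illustration, not the demonstrator's actual policy: the linear mapping, the thresholds `low` and `high`, and the function names are all assumptions chosen for the example. It only reproduces the stated trend that a higher MRPI (a more memory-bound workload) leads to a lower frequency.

```python
# big-cluster levels from the stated range: 0.2 GHz to 2 GHz in 100 MHz steps.
BIG_LEVELS_MHZ = list(range(200, 2001, 100))


def mrpi(memory_reads: int, instructions: int) -> float:
    """Memory Reads Per Instruction over one sampling interval."""
    if instructions <= 0:
        return 0.0  # no instructions retired: treat the interval as idle
    return memory_reads / instructions


def select_frequency_mhz(value: float, levels_mhz: list[int],
                         low: float = 0.01, high: float = 0.05) -> int:
    """Map an MRPI value to a frequency level; higher MRPI gives lower frequency.

    `low` and `high` are hypothetical thresholds: at or below `low` the
    workload is treated as compute-bound (maximum frequency), at or above
    `high` as memory-bound (minimum frequency), with linear steps in between.
    """
    levels = sorted(levels_mhz)
    if value <= low:
        return levels[-1]
    if value >= high:
        return levels[0]
    fraction = (value - low) / (high - low)  # 0 at `low`, 1 at `high`
    index = round((1.0 - fraction) * (len(levels) - 1))
    return levels[index]


if __name__ == "__main__":
    for reads in (5, 30, 80):  # hypothetical counter samples per 1000 instructions
        value = mrpi(reads, 1000)
        print(f"MRPI={value:.3f} -> {select_frequency_mhz(value, BIG_LEVELS_MHZ)} MHz")
```

In practice the memory-read and instruction counts would come from hardware performance counters sampled periodically, and the thresholds would need tuning per platform.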






