Energy Saving Opportunities and Strategies for Multicore Embedded Systems

This article outlines some of the major limitations of multicore embedded systems and discusses possible solutions. Using an existing embedded processor as an example, it points out opportunities to exploit the existing architecture to improve the system's energy efficiency. Combining multicore processors with emerging embedded platforms can deliver the high computing power that modern embedded applications require. However, such applications demand high-frequency switching, which leads to higher power consumption, excessive chip temperatures, and power-supply and ground noise. Developers can use this article to identify where the energy efficiency of modern embedded systems can be improved and to understand the options for maximizing it.

Independent energy saving of multi-core processors

This article uses the UltraSPARC T1 processor from Oracle (formerly Sun Microsystems) as an example. The UltraSPARC T1 was chosen because its design source code, simulation tools, and design verification suite are open source and can be downloaded from the Oracle website. This case is used to discuss where and how energy savings can be achieved.

Figure 2 shows the trap logic unit (TLU) associated with each core of the processor. A trap is a vectored transfer of control from software running at a lower privilege level to software running at a higher one, for example from user mode to supervisor or hypervisor mode. On the UltraSPARC T1, a trap can be caused by a Tcc instruction, an instruction-induced exception, a reset, an asynchronous error, or an interrupt request.

Figure 2: Trap logic unit.

Traps usually cause the SPARC pipeline to be flushed. The processor state is saved on the trap register stack, and the trap handler code is then executed. The actual transfer of control goes through a trap table that contains the first eight instructions of each trap handler. The virtual base address of the table used for traps taken into privileged mode is specified in the trap base address (TBA) register, and the offset into the table depends on the trap type and the current trap level. The trap handler finishes executing when a DONE or RETRY instruction is encountered. Traps may be synchronous or asynchronous with respect to the SPARC core pipeline.

Figure 2 shows the trap control and data flow between the TLU and the other hardware modules in the SPARC core. Traps arriving from the instruction fetch unit (IFU), the execution unit (EXU), the load/store unit (LSU), and the TLU itself are first prioritized, and the type of the winning trap is determined. Depending on the trap type, and provided that no higher-priority interrupt or asynchronous trap is pending in the queue, a flush signal is sent to the LSU so that all previously outstanding instructions are committed. The trap type also determines which processor state registers must be saved on the trap register stack. Finally, the trap base address is selected and sent to the pipeline for further execution.
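
The trap table lookup described above can be sketched in a few lines of Python. This is only an illustration, not the processor's actual logic: it assumes the SPARC V9 style address layout (upper TBA bits as the table base, one bit that selects the half of the table used when the trap level is greater than zero, and a 9-bit trap type scaled by the 32-byte entry size). The function and constant names are hypothetical.

```python
# Illustrative sketch of trap vector address formation.
# Assumption: SPARC V9 style layout, i.e. TBA<63:15> | (TL > 0) << 14 | TT << 5.

INSTRUCTIONS_PER_ENTRY = 8   # from the text: eight instructions per handler entry
BYTES_PER_INSTRUCTION = 4    # SPARC instructions are 32 bits wide
ENTRY_BYTES = INSTRUCTIONS_PER_ENTRY * BYTES_PER_INSTRUCTION  # 32 bytes per entry

TT_BITS = 9                  # assumption: 9-bit trap type field
TL_SELECT_BIT = 14           # assumption: bit set when the trap is taken at TL > 0
BASE_MASK = ~((1 << 15) - 1) & ((1 << 64) - 1)  # keep only bits 63:15 of TBA


def trap_vector_address(tba: int, trap_type: int, trap_level: int) -> int:
    """Return the address of the first instruction of the trap handler.

    tba        -- value of the trap base address register
    trap_type  -- trap type (TT) of the trap being taken
    trap_level -- trap level (TL) at the moment the trap is taken
    """
    if not 0 <= trap_type < (1 << TT_BITS):
        raise ValueError("trap_type does not fit in the trap type field")
    if trap_level < 0:
        raise ValueError("trap_level must be non-negative")

    base = tba & BASE_MASK                      # table base from the TBA register
    tl_select = 1 if trap_level > 0 else 0      # second half of the table if TL > 0
    offset = trap_type * ENTRY_BYTES            # each entry holds 8 instructions
    return base | (tl_select << TL_SELECT_BIT) | offset


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    print(hex(trap_vector_address(0x4000_0000, 0x100, 0)))
    print(hex(trap_vector_address(0x4000_0000, 0x100, 1)))
```

Because every entry has a fixed size, the hardware only needs to concatenate fields to form the handler address; no memory lookup is required before the first handler instruction is fetched.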

Figure 3: Chip block diagram.

Figure 3 shows the chip layout of the multicore embedded processor. The processor has a variable number of cores, L2 cache banks, off-core floating-point units (FPUs), and input/output logic, all interconnected by an on-chip network. In the CASPER simulation environment, designers can modify the various architectural parameters.

Energy saving opportunities

For the multicore embedded processor described above, the following core-level and chip-level power-saving candidates (PSCs) have been identified:

1. The register file, which is a thread-specific unit. Each thread has its own register file of 160 64-bit words, so considerable power can be saved when a thread's task is blocked or idle.

2. The load miss queue (LMQ), which queues loads that miss in the data cache. Because the load miss queue can be shared between threads, it offers smaller power savings.

3. The branch predictor. Branch history tables can be thread-specific, so considerable power can be saved.

4. The entire core. Considerable power can be saved when every task on every thread of the core is blocked or idle, or when no task has been dispatched to any of the core's threads.

5. The trap unit in each core, which handles hardware and software interrupts. Study results show that on the UltraSPARC T1, trap instructions account for less than 1% of all instructions in typical SPECJBB network processing applications. This makes the trap unit a very good power-saving candidate. Note that although the rest of the trap logic can stay in power-saving mode most of the time, the input queue that receives traps must remain active at all times; the power consumption of this queue, however, is negligible.

6. The DMA controller for the L2 cache, which controls the flow of data between the cache and the input/output buffers.

7. The command and data queues between the cores and the L2 cache.

8. The cache miss path logic, which needs to be active only when a miss in the on-chip L2 cache requires access to the off-chip cache or main memory.
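
As a rough illustration of how some of the core-level candidates above could drive a gating decision, the following Python sketch picks units to place in a power-saving mode from the state of a core's hardware threads. It is a simplified model under stated assumptions, not a description of the UltraSPARC T1 hardware or of any real power-management interface: the `Core` class, the `gating_plan` function, the `traps_pending` field, and the unit names are all hypothetical.

```python
# Hypothetical model of a per-core power-gating decision.
from dataclasses import dataclass
from enum import Enum
from typing import List


class ThreadState(Enum):
    RUNNING = "running"
    BLOCKED = "blocked"
    IDLE = "idle"      # also used here for "no task dispatched"


@dataclass
class Core:
    core_id: int
    thread_states: List[ThreadState]  # one entry per hardware thread
    traps_pending: int = 0            # traps waiting in the input receive queue


def gating_plan(core: Core) -> List[str]:
    """Return the names of units that could enter a power-saving mode."""
    inactive = [s is not ThreadState.RUNNING for s in core.thread_states]

    # Candidate 4: every thread is blocked or idle, so the whole core is a candidate.
    if all(inactive):
        return [f"core{core.core_id}"]

    plan = []
    # Candidate 1: thread-specific register files of inactive threads.
    for thread_id, is_inactive in enumerate(inactive):
        if is_inactive:
            plan.append(f"core{core.core_id}.regfile.t{thread_id}")

    # Candidate 5: trap logic, excluding the input receive queue, which stays on.
    if core.traps_pending == 0:
        plan.append(f"core{core.core_id}.trap_logic")
    return plan


if __name__ == "__main__":
    busy = Core(0, [ThreadState.RUNNING, ThreadState.BLOCKED,
                    ThreadState.IDLE, ThreadState.RUNNING])
    parked = Core(1, [ThreadState.BLOCKED, ThreadState.IDLE])
    print(gating_plan(busy))
    print(gating_plan(parked))
```

The sketch checks the coarsest candidate first: if the whole core can be gated there is no point in listing its sub-units separately. A real design would also have to weigh the wake-up latency and the energy cost of entering and leaving each power-saving mode, which this model ignores.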
