EXTENSIBLE PROCESSORS VS ACCELERATORS – AND HOW RISC-V CHANGES THE DYNAMIC
If you were to ask any good designer today what the best architecture is for an SoC that must handle complex DSP or high-bandwidth traffic demands, the recommendation will almost always be one or more off-the-shelf processors, complemented by hardware accelerators that offload complex processing from the main cores. This combination is expected to give the best power and performance outcome.
The accelerators are usually implemented as standalone RTL blocks connected to the main processor bus, and are optimized to be highly efficient on the data types they handle. On the surface, then, they appear to be the logical choice for delivering optimal power and performance.
BUT, HOW DID THIS COMMON ARCHITECTURE COME ABOUT, AND IS IT ALWAYS THE BEST APPROACH?
How it came about is easy to answer: when you have a fixed processor IP and an ISA you cannot change, offloading complex data manipulations to accelerator IP is the only practical solution. So in a world dominated by ARM and MIPS, hardware accelerators were the only option.
As they say, when all you have is a hammer, everything looks like a nail.
So the next part of the question is “Are accelerators the best solution?”
That is a much more nuanced discussion, and it depends heavily on the specific application. What we can say is that in many of the cases where accelerators are used, they are suboptimal. Within the narrow context of their own operation they save power and processing time; at the system level, however, they may consume more power and processing time than the alternative.
The reason is that if you need flexible pre- and/or post-processing of data in addition to the accelerator's primary data manipulation, the application ends up performing many CPU operations and many memory operations on top of the accelerator's own work. The net result is that any advantage of the accelerator is offset by the overhead of pre- and post-processing.
WHAT DOES THIS HAVE TO DO WITH RISC-V?
Since RISC-V is both an open and extensible ISA, you can build an implementation that is compliant with the standard, and therefore able to take advantage of the rich software ecosystem (operating systems, libraries, etc.), while at the same time adding application-specific processor optimizations and extensions. That is not possible with a traditional ARM or MIPS processor.
The advantage of extensions over accelerators is that the main processor can perform the needed data transformations itself, in a highly efficient manner.
This means that the following accelerator-based sequence
- Processor data read
- Processor data pre-process
- Processor data write
- Accelerator init
- Accelerator data read
- Accelerator data transform
- Accelerator data write
- Processor data read
- Processor data post-process
- Processor data write
Becomes
- Processor data read
- Processor data pre-process
- Processor data transform (via processor extensions)
- Processor data post-process
- Processor data write
The drastic reduction in system traffic reduces overall system complexity and power. This is not a new concept, but the advent of the extensible RISC-V architecture makes it easier than ever to achieve.
SO ARE EXTENSIONS ALWAYS THE ANSWER?
I would love to say yes, but that would put us back into the “everything’s a nail” situation. The reality is that it depends on your data and your application.
Thanks to the extensibility of the Codix-Bk Core (RISC-V compliant) and the ease of modifying a RISC-V implementation using Codix Optimizer, you can easily determine whether the best answer is an accelerator or processor optimizations/extensions. This has not been possible before: there were extensible processors, but they locked you into a closed and limited ecosystem.