ASIP stands for “application-specific instruction-set processor” and simply means a processor which has been designed to be optimal for a particular application or domain.
General-purpose versus application- or domain-specific processors
Most processor cores to date have been general-purpose, meaning that they are designed to handle a wide range of applications with good average performance. As a result, if you have a specialised, computationally intensive workload such as audio processing, you may need a high-performance core (for example, one with a SIMD unit or zero-overhead loops) or a high clock frequency to meet your requirements. Either choice may exceed your silicon or power budget.
An alternative is to create an ASIP with a specialised architecture, optimised to deliver the performance required for the audio processing efficiently. The ASIP would not usually be designed to handle more generic operations optimally, such as those needed for an operating system. Instead, if an OS is needed, you would probably run it on a separate general-purpose core which would not be constrained by the need to run audio processing algorithms. Thus, the ASIP design is optimised for performance with just enough flexibility to meet its use case.
ASIPs have been the subject of research in universities around the world and have been applied to a number of domains such as audio signal processing, image sensors, and baseband signal processing. Indeed, Codasip founder and CEO Karel Masařík with CTO Zdeněk Přikryl researched design automation for ASIPs at the TU Brno (see bibliography below).
Breakdown of Dennard Scaling
ASIPs or domain-specific processors are likely to be used more widely in the future due to semiconductor scaling issues. For decades, developers of SoCs have relied on Moore’s law and Dennard Scaling to get more and more performance and circuit density through successively finer silicon geometries.
DRAM pioneer Dr Robert Dennard observed that with each generation of technology, the linear dimensions of transistors shrank by about 30 %, thus reducing their area by about 50 %, and that with smaller delays the maximum clock frequency could increase by about 40 %. If, additionally, the supply voltage is reduced by 30 %, the power consumption of a transistor operating at maximum frequency is halved. In summary, with each generation the transistor density doubled and the speed increased by about 1.4 × while the power consumption per unit of chip area stayed the same.
While this scaling applied, it was generally reasonable to use general-purpose processor cores and to rely on new generations of silicon technology to deliver the performance required with an acceptable power consumption. However, Dennard Scaling is generally considered to have broken down around 2006, with smaller performance gains per generation and leakage current worsening power consumption. Because of this, the semiconductor industry must change, and a new approach to processing is needed.
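The arithmetic behind these figures can be checked with a short sketch. It assumes the usual first-order model for dynamic power, P = C × V² × f, and assumes that capacitance shrinks in proportion to the transistor dimensions; neither assumption is spelled out above, and leakage is ignored. The function name `dennard_generation` is purely illustrative.

```python
def dennard_generation(shrink=0.7):
    """Relative per-generation changes for a linear shrink factor (0.7 = 30 % smaller)."""
    area = shrink ** 2        # area goes with the square of the linear dimensions
    density = 1 / area        # transistors per unit of chip area
    frequency = 1 / shrink    # shorter delays allow a faster clock
    capacitance = shrink      # assumption: capacitance scales with the dimensions
    voltage = shrink          # supply voltage reduced by the same 30 %
    power = capacitance * voltage ** 2 * frequency  # P = C * V^2 * f per transistor
    return {
        "area": area,
        "density": density,
        "frequency": frequency,
        "power_per_transistor": power,
        "power_per_area": power * density,
    }


for name, value in dennard_generation().items():
    print(f"{name:22s} x{value:.2f}")
```

With a shrink factor of 0.7, the area and the power per transistor both come out at roughly 0.49, the density at about 2.04, the frequency at about 1.43, and the power per unit area at 1.00, which matches the rounded figures quoted above.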
Heterogeneous processing – the new way
To date, the breakdown in Dennard Scaling has been tackled by incorporating different types of general-purpose core on a single SoC. For example, mobile phone SoCs have combined application processors, GPUs, DSPs, and MCUs, but none of these classes of processor has an application-specific instruction set.
With new products demanding new algorithms for artificial intelligence, advanced graphics, and advanced security, more specialised accelerators are needed and are being developed. Such accelerators are designed to handle computationally demanding algorithms efficiently. Each accelerator will need an optimised instruction set and microarchitecture; in other words, each will be an ASIP.
An ASIP or accelerator is not a ‘one size fits all’ concept but can vary in the amount of specialisation involved. At one end of the range, you might start with an MCU and enhance its performance or code density by adding custom instructions. At the other end, you might create specialised logic with very limited programmability.
When developing an instruction set, you do not necessarily have to start from scratch. If, for example, the RISC-V programming model is suitable, you can start with the base set and then develop whatever custom instructions you need.
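As a rough illustration of what adding a custom instruction to the RISC-V base set involves at the encoding level, the sketch below packs an R-type instruction word into the custom-0 major opcode (0001011), which the RISC-V specification sets aside for custom extensions. The helper name `encode_r_type` and the choice of funct3 and funct7 values are hypothetical; a real design would choose its own encodings.

```python
CUSTOM_0 = 0b0001011  # major opcode reserved for custom extensions in the RISC-V spec


def encode_r_type(funct7, rs2, rs1, funct3, rd, opcode=CUSTOM_0):
    """Pack the R-type fields (most significant first) into a 32-bit word."""
    fields = [(funct7, 7), (rs2, 5), (rs1, 5), (funct3, 3), (rd, 5), (opcode, 7)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        word = (word << width) | value
    return word


# Hypothetical custom instruction: rd = x10 (a0), rs1 = x11 (a1), rs2 = x12 (a2),
# with funct3 = 0 and funct7 = 0 chosen arbitrarily for illustration.
word = encode_r_type(funct7=0, rs2=12, rs1=11, funct3=0, rd=10)
print(f"0x{word:08X}")
```

The register fields sit in the same positions as in the standard `add a0, a1, a2` instruction; only the opcode differs, which is what lets a custom instruction coexist with the base set.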
Importance of Software Development
While application software developers are increasingly moving to higher levels of abstraction, embedded software is still primarily written in C or C++. These languages provide a close mapping to processor hardware while remaining ISA-agnostic and relatively portable between processors and architectures. One challenge of creating custom hardware is ensuring that the needs of software developers are met too. Techniques such as intrinsics allow direct access to the instruction set from C, but they reduce the flexibility of the code. They may still be a viable answer when complex functions are encapsulated in a single instruction. Better still is a compiler created for your domain-specific accelerator that can automatically target the custom instructions you create. This, along with an instruction set simulator and profiler, accelerates the rate at which you can try different design iterations to converge on an optimal solution. Codasip Studio provides a complete solution for this process, with the ability to generate the ISA and compiler from a simple instruction-level model.
Design Automation & Verification
With the need to develop many and varied accelerators, it is not efficient to develop the instruction sets and microarchitecture manually. Application software can be analysed by profiling and the instruction set tuned to those needs. A processor design automation toolset like Codasip Studio can then be used to generate a software toolchain and an instruction set simulator. Specifically, Codasip Studio automates the generation of a C/C++ compiler that is fully aware of the instruction set and can infer specific instructions automatically. It is important to have the C/C++ compiler in the loop, to see the impact of the application-specific instructions. The same applies to the instruction set simulators, debuggers, profilers, and other tools in the SDK that are automatically generated. Codasip Studio can also be used to generate the hardware design, including RTL, testbenches, and a UVM environment. Codasip was originally founded to create co-design tools for ASIPs (hence the name “Co-dASIP”), although Codasip Studio has subsequently been used for creating more general-purpose cores such as RISC-V embedded and application cores too.
Processor design does not end with generating the RTL code: verification is the dominant part of the design cycle and needs to be rigorous. As Philippe Luc explained to Semiconductor Engineering, RTL verification is multi-layered and complex. Codasip Studio can automate key parts of the verification process to speed up the full ASIP design flow. Among other things, Codasip Studio provides automated coverage points for the optimised instructions. The provided UVM environment makes it easy to run programs (including the new instructions) on both the model and the RTL and to compare the results. A constrained random program generator is provided to help close coverage and ensure quicker verification for our customers.
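To make the profiling step more concrete, here is a minimal sketch of one kind of analysis that can suggest custom instructions: counting adjacent instruction pairs in an execution trace where the second instruction consumes the result of the first. The trace format, the register names, and the function `dependent_pairs` are assumptions made for illustration and do not represent the output of Codasip Studio or any particular profiler.

```python
from collections import Counter


def dependent_pairs(trace):
    """Count adjacent instruction pairs where the second reads the first's result."""
    counts = Counter()
    for (op1, dest1, _), (op2, _, sources2) in zip(trace, trace[1:]):
        if dest1 is not None and dest1 in sources2:
            counts[(op1, op2)] += 1
    return counts


# Hypothetical trace of a dot-product loop body, repeated four times.
# Each entry is (mnemonic, destination register, source registers).
loop_body = [
    ("lw",   "t0", ("a0",)),
    ("lw",   "t1", ("a1",)),
    ("mul",  "t2", ("t0", "t1")),
    ("add",  "s0", ("s0", "t2")),
    ("addi", "a0", ("a0",)),
    ("addi", "a1", ("a1",)),
    ("bne",  None, ("a0", "a2")),
]
trace = loop_body * 4

for pair, count in dependent_pairs(trace).most_common():
    print(pair, count)
```

In this toy trace, the multiply feeding an add shows up once per loop iteration, which is the kind of pattern that might motivate a fused multiply-accumulate instruction. A real flow would weigh such candidates by cycle cost and hardware area before committing to them.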
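The idea of running the same program on both the model and the RTL and comparing results can be sketched as a trace comparison. The example below assumes each simulation yields a list of register writes as (program counter, register, value) tuples; that format and the helper `first_mismatch` are hypothetical and are not a description of the UVM environment generated by Codasip Studio.

```python
from itertools import zip_longest


def first_mismatch(model_trace, rtl_trace):
    """Return (index, expected, actual) for the first differing entry, else None."""
    for index, (expected, actual) in enumerate(zip_longest(model_trace, rtl_trace)):
        if expected != actual:
            return index, expected, actual
    return None


# Hypothetical register-write traces: (program counter, register, value written).
model = [(0x00, "a0", 5), (0x04, "a1", 7), (0x08, "a2", 35)]
rtl_ok = list(model)
rtl_bad = [(0x00, "a0", 5), (0x04, "a1", 7), (0x08, "a2", 12)]

print(first_mismatch(model, rtl_ok))   # traces agree
print(first_mismatch(model, rtl_bad))  # third write differs
```

Using `zip_longest` means that a trace which stops early is also reported as a mismatch, with `None` standing in for the missing entry.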
Bibliography
MASAŘÍK Karel. UML in design of ASIP. IFAC Proceedings Volumes, vol. 39, no. 17, pp. 209-214, September 2006.
ZACHARIÁŠOVÁ Marcela, PŘIKRYL Zdeněk, HRUŠKA Tomáš and KOTÁSEK Zdeněk. Automated Functional Verification of Application Specific Instruction-set Processors. IFIP Advances in Information and Communication Technology, vol. 4, no. 403, pp. 128-138. ISSN 1868-4238.
PŘIKRYL Zdeněk. Fast Simulation of Pipeline in ASIP simulators. In: 15th International Workshop on Microprocessor Test and Verification. Austin: IEEE Computer Society, 2014, pp. 1-6. ISBN 978-0-7695-4000-9.
HUSÁR Adam, PŘIKRYL Zdeněk, DOLÍHAL Luděk, MASAŘÍK Karel and HRUŠKA Tomáš. ASIP Design with Automatic C/C++ Compiler Generation. Haifa, 2013.