Creating specialized architectures with processor design automation

With semiconductor scaling slowing down, if not failing, SoC designers are challenged to find ways of meeting the demand for greater computational performance. In their 2018 Turing lecture, Hennessy & Patterson pointed out that new methods are needed to work around failing scaling and predicted ‘A New Golden Age for Computer Architecture’. A key approach in addressing this challenge is to innovate architecturally and to create more specialized processing units: domain-specific processors and accelerators.
If you haven’t already, I recommend reading our white paper on semiconductor scaling and what is next for processors.

Automation for creating application-specific processors

Specialized processor cores are, by definition, going to vary a lot depending on their workload. Some may be readily developed by customizing existing RISC-V processor cores. However, that approach will not work in every instance, and sometimes a novel architecture is needed. In such cases the instruction set architecture (ISA) and the microarchitecture both have to be explored to find a good design solution.

Traditionally, custom cores were developed by manually creating an instruction set simulator (ISS), software toolchain, and RTL. This process can be time-consuming and error-prone, not least because the three hand-written artefacts have to be kept consistent with each other. The alternative is to describe the processor core in a high-level language and to use processor design automation to generate the ISS, software toolchain, RTL and verification environment.

This is exactly what Codasip offers. Codasip Studio is our unique processor design automation toolset. It has been applied to RISC, DSP, and VLIW designs, and we use it for developing our own RISC-V cores. Processors are described using the CodAL architectural language, which covers both instruction-accurate (IA) and cycle-accurate (CA) descriptions.

Exploring the instruction set with Codasip Studio

The first stage in developing a core optimized for a particular application is to define the instruction set. If a RISC-V standard wordlength such as 32 bits or 64 bits is acceptable, then the corresponding base integer instruction set can be a good starting point to save time. However, if a different wordlength such as 8 or 16 bits is chosen, then RISC-V cannot be used directly. Instead, instructions based on the smaller wordlength should be developed, as Google did when developing their 8-bit tensor processing unit (TPU).
[Figure: CodAL semantics and content]
Once an initial instruction set is defined, an iterative exploration phase with the SDK in the loop can be undertaken. Rather than adapting an open-source toolchain or instruction set simulator such as GCC, LLVM or Spike by hand, we recommend describing the instruction set, resources and semantics using our CodAL processor description language. Once the instruction set is modelled, the software toolchain (or SDK) can be automatically generated using the Codasip Studio toolset. The SDK can then be used to profile real application software and identify any hotspots.
[Figure: SDK in the loop, architecture exploration]
Once hotspots are known, the instruction set can be adjusted to better meet the needs of the computational workload. The updated description in CodAL can be used to generate an SDK for further profiling. This can be repeated through further iterations until the ISA is finalized.
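
To make the idea of profiling for hotspots concrete, here is a minimal sketch in Python. It is not Codasip Studio output and does not use any Codasip API. It assumes you already have a dynamic instruction trace as a list of opcode mnemonics (the `trace` below is a hypothetical dot-product loop) and counts how often each pair of adjacent instructions occurs. Frequent pairs, such as a multiply followed by an add, are one kind of hint that a fused instruction might be worth evaluating.

```python
from collections import Counter


def pair_counts(trace):
    """Count adjacent opcode pairs in a dynamic instruction trace."""
    return Counter(zip(trace, trace[1:]))


# Hypothetical trace: the body of a dot-product loop, executed 100 times.
loop_body = ["lw", "lw", "mul", "add", "addi", "bne"]
trace = loop_body * 100

counts = pair_counts(trace)
total_pairs = len(trace) - 1

# Report the most frequent adjacent pairs as candidates for fusion.
for (first, second), count in counts.most_common(3):
    share = count / total_pairs
    print(f"{first} -> {second}: {count} times ({share:.1%} of adjacent pairs)")
```

In a real flow the trace would come from the profiler in the generated SDK, and a candidate instruction would be judged by re-profiling the application after the ISA change rather than by pair counts alone.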

Defining the microarchitecture with Codasip Studio

With a stable instruction set, the next phase is to develop the microarchitecture using CodAL. Decisions need to be taken on forms of parallelism (such as multi-issue), pipeline length, and privilege modes. This leads to a new iterative phase with the HDK in the loop.
[Figure: HDK in the loop]
CodAL is used to define a cycle-accurate description of the core using the existing instruction set and resources. This in turn can be used to generate a CA simulator, RTL and testbench. The performance of the design can be assessed using the CA simulator, RTL simulation or an FPGA prototype. The RTL can be analyzed for silicon area and timing. If the microarchitecture is not meeting its targets, it can be modified and the HDK regenerated.
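
As a rough illustration of how cycle counts and timing results combine when comparing microarchitecture candidates, consider the Python sketch below. All names and numbers are hypothetical and chosen only for the example: `cycles` stands for a count reported by a cycle-accurate simulation of the same program, and `clock_mhz` for a clock frequency estimated from timing analysis of the RTL.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    cycles: int        # cycles to run the benchmark (hypothetical value)
    clock_mhz: float   # achievable clock frequency in MHz (hypothetical value)

    def runtime_us(self) -> float:
        # cycles divided by frequency in MHz gives time in microseconds
        return self.cycles / self.clock_mhz


INSTRUCTIONS = 1_000_000  # same program, so same instruction count for both

candidates = [
    Candidate("short pipeline", cycles=1_300_000, clock_mhz=400.0),
    Candidate("deep pipeline", cycles=1_600_000, clock_mhz=600.0),
]

for c in candidates:
    cpi = c.cycles / INSTRUCTIONS
    print(f"{c.name}: CPI = {cpi:.2f}, runtime = {c.runtime_us():.0f} us")

fastest = min(candidates, key=Candidate.runtime_us)
print(f"Fastest on this benchmark: {fastest.name}")
```

With these made-up figures the deeper pipeline needs more cycles per instruction but still finishes sooner because of its higher clock. A real comparison would also weigh silicon area and power, which this sketch leaves out.
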
The last, essential stage is to verify the final RTL against an instruction-accurate (IA) golden reference. Codasip Studio can generate the UVM verification environment for this. Additionally, Codasip Studio has tools to help with functional coverage and to generate random assembler programs. Users should also use their own directed tests and apply multiple verification strategies.
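
The principle of checking an implementation against a golden reference can be shown with a toy example. The Python sketch below is not the generated UVM environment and makes no claim about how it works internally. It assumes a made-up two-instruction ISA, a reference model `step_reference`, and a stand-in for the design under test `step_dut` that contains a deliberate bug. Both are stepped through the same program and their register state is compared after every instruction.

```python
MASK32 = 0xFFFFFFFF


def step_reference(regs, instr):
    """Golden model: execute one instruction and return the new register state."""
    op, rd, rs1, rs2 = instr
    new = dict(regs)
    if op == "add":
        new[rd] = (regs[rs1] + regs[rs2]) & MASK32
    elif op == "sub":
        new[rd] = (regs[rs1] - regs[rs2]) & MASK32
    else:
        raise ValueError(f"unknown opcode {op!r}")
    return new


def step_dut(regs, instr):
    """Stand-in for the design under test, with a deliberate bug in 'sub'."""
    op, rd, rs1, rs2 = instr
    new = dict(regs)
    if op == "add":
        new[rd] = (regs[rs1] + regs[rs2]) & MASK32
    elif op == "sub":
        new[rd] = regs[rs1] - regs[rs2]  # bug: result is not wrapped to 32 bits
    else:
        raise ValueError(f"unknown opcode {op!r}")
    return new


def first_mismatch(program, initial_regs):
    """Return the index of the first instruction where the states diverge."""
    ref, dut = dict(initial_regs), dict(initial_regs)
    for index, instr in enumerate(program):
        ref = step_reference(ref, instr)
        dut = step_dut(dut, instr)
        if ref != dut:
            return index
    return None


program = [("add", "x1", "x2", "x3"), ("sub", "x4", "x2", "x3")]
initial = {"x1": 0, "x2": 1, "x3": 2, "x4": 0}
mismatch = first_mismatch(program, initial)
print("first mismatch at instruction index:", mismatch)
```

Comparing after every instruction, rather than only at the end of the program, pins a failure to the instruction that caused it. Randomly generated programs help here because they reach operand combinations, such as the negative result above, that directed tests may miss.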

Roddy Urquhart

Senior Marketing Director
