Title: ASICImplementations
Date: 2016-12-15 22:44
The aim of the Cryptech project is to develop an open, free, and
auditable HSM. The Cryptech HSM includes both SW and HW parts. In at
least the first iteration of the Cryptech HSM, the HW parts are
implemented using FPGA devices. However, the ability to implement the
HW parts in a Cryptech ASIC device in a future iteration is anticipated
in the design. This text provides a short description of what the HW
part of the Cryptech HSM contains, the design style used, and what would
have to change in order to implement the HW part in an ASIC.
The Cryptech digital functionality cores, such as the SHA-256 core, are
written in generic RTL (Register Transfer Level) Verilog code. The code
is written in a fairly conservative coding style and uses language
features from IEEE 1364-2001 (aka Verilog 2001).
All RTL code is divided into modules. Each module contains one process for register updates and reset (reg_update) and one or more combinational processes for the datapath and for support logic such as counters. Finally, if needed, each module has a separate process that implements the logic for the finite state machine that controls the behaviour of the module.
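
The module structure described above can be sketched as follows. This is a minimal, hypothetical example and not code from the Cryptech repositories: the module name `example_timer`, its ports, and the `_reg`/`_new`/`_we` naming are assumptions made for illustration. It shows one clocked `reg_update` process, one combinational process for a counter, and one combinational process for the control state machine.

```verilog
// Hypothetical example, not taken from the Cryptech sources.
// A countdown timer: load a value on start, count to zero, go idle.
module example_timer(
                     input wire         clk,
                     input wire         reset_n,
                     input wire         start,
                     input wire [7 : 0] load_value,
                     output wire        ready
                    );

  localparam CTRL_IDLE = 1'b0;
  localparam CTRL_RUN  = 1'b1;

  // Each register has a _new value and a write enable (_we).
  reg [7 : 0] ctr_reg;
  reg [7 : 0] ctr_new;
  reg         ctr_we;
  reg         ctrl_reg;
  reg         ctrl_new;
  reg         ctrl_we;

  // Control signals from the state machine to the counter logic.
  reg         ctr_load;
  reg         ctr_dec;

  assign ready = (ctrl_reg == CTRL_IDLE);

  // The only clocked process: synchronous reset and register updates.
  always @ (posedge clk)
    begin : reg_update
      if (!reset_n)
        begin
          ctr_reg  <= 8'h00;
          ctrl_reg <= CTRL_IDLE;
        end
      else
        begin
          if (ctr_we)
            ctr_reg <= ctr_new;
          if (ctrl_we)
            ctrl_reg <= ctrl_new;
        end
    end

  // Combinational support logic: the counter.
  always @*
    begin : ctr_logic
      ctr_new = 8'h00;
      ctr_we  = 1'b0;
      if (ctr_load)
        begin
          ctr_new = load_value;
          ctr_we  = 1'b1;
        end
      else if (ctr_dec)
        begin
          ctr_new = ctr_reg - 1'b1;
          ctr_we  = 1'b1;
        end
    end

  // Combinational logic for the control state machine.
  always @*
    begin : ctrl_fsm
      ctr_load = 1'b0;
      ctr_dec  = 1'b0;
      ctrl_new = CTRL_IDLE;
      ctrl_we  = 1'b0;
      case (ctrl_reg)
        CTRL_IDLE:
          begin
            if (start)
              begin
                ctr_load = 1'b1;
                ctrl_new = CTRL_RUN;
                ctrl_we  = 1'b1;
              end
          end
        CTRL_RUN:
          begin
            if (ctr_reg == 8'h00)
              begin
                ctrl_new = CTRL_IDLE;
                ctrl_we  = 1'b1;
              end
            else
              ctr_dec = 1'b1;
          end
      endcase
    end

endmodule
```

Because every combinational output is given a default value at the top of its process, no latches are inferred, and all state lives in the registers updated by `reg_update`.
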
Each core is divided into a core module, for example sha256_core.v, and a number of submodules that the core module instantiates. The core module provides raw, wide ports (a 256-bit wide key for AES, for example) that are not suitable for use in a stand-alone system. Instead, each core comes with a top level wrapper, for example sha256.v. This top level wrapper contains all registers and logic needed to provide all functionality of the core via a simple 32-bit memory-like interface. If the core is going to be used as a tightly integrated submodule, the wrapper can be discarded. Similarly, if the core is going to be used in a bus system that uses a specific bus standard such as AMBA AHB, CoreConnect or WISHBONE, only the top level wrapper needs to be replaced or modified to match the desired bus standard.
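
As an illustration of the core/wrapper split, the sketch below wraps a hypothetical core that has a raw 64-bit port behind a 32-bit memory-like interface. Everything here is assumed for the example: the module names `example_core` and `example`, the signal names (`cs`, `we`, `address`, `write_data`, `read_data`), and the address map. The placeholder core simply inverts its input and stands in for real functionality.

```verilog
// Hypothetical stand-in for a core with raw, wide ports.
module example_core(
                    input wire [63 : 0]  block,
                    output wire [63 : 0] result
                   );
  assign result = ~block;   // placeholder function only
endmodule

// Hypothetical wrapper that exposes the core as 32-bit words.
module example(
               input wire           clk,
               input wire           reset_n,
               input wire           cs,
               input wire           we,
               input wire [7 : 0]   address,
               input wire [31 : 0]  write_data,
               output wire [31 : 0] read_data
              );

  // Address map, assumed for this sketch.
  localparam ADDR_BLOCK0  = 8'h10;
  localparam ADDR_BLOCK1  = 8'h11;
  localparam ADDR_RESULT0 = 8'h20;
  localparam ADDR_RESULT1 = 8'h21;

  reg [31 : 0]  block0_reg;
  reg           block0_we;
  reg [31 : 0]  block1_reg;
  reg           block1_we;
  reg [31 : 0]  tmp_read_data;

  wire [63 : 0] core_result;

  assign read_data = tmp_read_data;

  // The wide port is assembled from the 32-bit registers.
  example_core core_inst(
                         .block({block0_reg, block1_reg}),
                         .result(core_result)
                        );

  always @ (posedge clk)
    begin : reg_update
      if (!reset_n)
        begin
          block0_reg <= 32'h0;
          block1_reg <= 32'h0;
        end
      else
        begin
          if (block0_we)
            block0_reg <= write_data;
          if (block1_we)
            block1_reg <= write_data;
        end
    end

  // Address decoder for writes and reads.
  always @*
    begin : api
      block0_we     = 1'b0;
      block1_we     = 1'b0;
      tmp_read_data = 32'h0;
      if (cs)
        begin
          if (we)
            begin
              case (address)
                ADDR_BLOCK0: block0_we = 1'b1;
                ADDR_BLOCK1: block1_we = 1'b1;
                default: ;
              endcase
            end
          else
            begin
              case (address)
                ADDR_RESULT0: tmp_read_data = core_result[63 : 32];
                ADDR_RESULT1: tmp_read_data = core_result[31 : 0];
                default: ;
              endcase
            end
        end
    end

endmodule
```

Adapting such a design to a specific bus standard would then only touch the wrapper; the core module and its wide ports stay unchanged.
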
The RTL code does not explicitly instantiate any hard macros such as
memories, multipliers, etc. Instead all such functions are left to the
synthesis tool to infer based on the code. All memories are placed in separate modules to allow easy modification of the design. In an ASIC setting any memories not automatically mapped will be replaced by instantiation of specific macros.
Some of the memories in the designs have combinational read (i.e. the read
data is not captured by an output register, which would imply a one cycle read
latency). For some FPGA technologies these memories are not compatible with the available physical memories. The synthesis tools therefore implement these memories
using separate registers rather than selecting a memory instance. In an ASIC
implementation these memories would likely become real memory macros to allow for a faster and more compact implementation.
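
The difference between the two read styles can be sketched as below. Both modules are hypothetical (names, depth and width are assumptions for the example) and are written so that a synthesis tool can infer the memory. How a given tool maps each variant depends on the target technology.

```verilog
// Hypothetical 256 x 32 memory with combinational read:
// read_data follows addr within the same cycle.
module example_mem_comb(
                        input wire           clk,
                        input wire           wr,
                        input wire [7 : 0]   addr,
                        input wire [31 : 0]  write_data,
                        output wire [31 : 0] read_data
                       );
  reg [31 : 0] mem [0 : 255];

  assign read_data = mem[addr];

  always @ (posedge clk)
    begin : mem_update
      if (wr)
        mem[addr] <= write_data;
    end
endmodule

// The same memory with a registered read:
// read_data is available one cycle after addr is presented.
module example_mem_sync(
                        input wire           clk,
                        input wire           wr,
                        input wire [7 : 0]   addr,
                        input wire [31 : 0]  write_data,
                        output wire [31 : 0] read_data
                       );
  reg [31 : 0] mem [0 : 255];
  reg [31 : 0] read_data_reg;

  assign read_data = read_data_reg;

  always @ (posedge clk)
    begin : mem_update
      if (wr)
        mem[addr] <= write_data;
      read_data_reg <= mem[addr];
    end
endmodule
```

Since each memory sits in its own module, swapping the inferred array for an instantiated ASIC memory macro would only change the body of that module.
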
External interfaces such as GPIO, Ethernet GMII, UART, etc., will always
require some modification for the Cryptech design to be implemented in a
given technology, whether it is a specific FPGA type or an ASIC. The
important thing is that the Cryptech design does not use technology
specific macros to implement the interfaces. But pin assignments,
timing, and electrical requirements will always require adjustment and
work.
The design style used in the Cryptech Verilog code currently follows the
guidelines from the FPGA vendors Altera and Xilinx. This means that we
use synchronous reset. For an ASIC implementation this will also work,
even though asynchronous reset is far more common in ASIC designs. Changing
to asynchronous reset is not a very big undertaking however, as the
register reset and update clocking are separated into easily
identifiable processes (reg_update) in the modules.
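
A minimal sketch of the change from synchronous to asynchronous reset, using two hypothetical register modules (the names and ports are assumptions for the example). The only difference between them is the sensitivity list of the `reg_update` process.

```verilog
// Hypothetical register with synchronous, active-low reset.
module example_reg_sync(
                        input wire           clk,
                        input wire           reset_n,
                        input wire           we,
                        input wire [31 : 0]  d,
                        output wire [31 : 0] q
                       );
  reg [31 : 0] q_reg;

  assign q = q_reg;

  always @ (posedge clk)
    begin : reg_update
      if (!reset_n)
        q_reg <= 32'h0;
      else if (we)
        q_reg <= d;
    end
endmodule

// The same register with asynchronous, active-low reset.
// Only the sensitivity list differs.
module example_reg_async(
                         input wire           clk,
                         input wire           reset_n,
                         input wire           we,
                         input wire [31 : 0]  d,
                         output wire [31 : 0] q
                        );
  reg [31 : 0] q_reg;

  assign q = q_reg;

  always @ (posedge clk or negedge reset_n)
    begin : reg_update
      if (!reset_n)
        q_reg <= 32'h0;
      else if (we)
        q_reg <= d;
    end
endmodule
```
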
Most if not all registers in the Cryptech Verilog code have a defined
reset state. Most registers also have a write enable signal that
controls the update. This corresponds well with the registers available
in FPGA technologies from Altera and Xilinx and with the design strategy those vendors recommend. This is also in line with common
and good design styles for ASICs, which allows for compact code and low
power implementations. The design does not currently use any clock gating. In future revisions this might be added if power consumption needs to be reduced, provided that it does not add side channel issues.
The Cryptech hardware design will use external persistent memories for
protected key storage as well as external SRAM for protected master key
storage. In an ASIC implementation the master key memory would probably
be integrated to further enhance security.
Just like other external interfaces (see above), the interfaces for the
external memories do not use any explicitly instantiated hard macros in
the FPGAs.
The current Cryptech design contains two separate physical entropy
sources.
1: An avalanche noise based entropy source placed outside the FPGA. The
entropy source signal is sampled by the FPGA using an edge (flank) detection
mechanism.
An ASIC implementation would be able to use the external entropy source just like the FPGA. Furthermore, depending on the process options, it might be
possible to have an internal avalanche diode based on ESD structures commonly used in I/O pin implementations. In a power management capable process, functionality available in step-up converters might also be usable as an internal avalanche noise source.
Note that integrating the avalanche noise source does not mean that an off-chip noise source is excluded. The Cryptech RNG is modular and having both an internal and an external avalanche noise source is quite possible.
2: A ring oscillator based entropy source placed inside the FPGA. The ring oscillator used in the FPGA is based on carry chain feedback through adders. An ASIC implementation of this ring oscillator should work and produce noise with similar characteristics. However, the specific circuit will have to be characterized with explicit layout and qualified for the given process.
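
The flank detection used to sample the external avalanche noise source (item 1 above) could look roughly like the sketch below. This is a hypothetical illustration, not the Cryptech entropy core: the module name, the ports, and in particular the choice to capture the least significant bit of a free-running cycle counter at each rising flank are assumptions made for the example.

```verilog
// Hypothetical flank detector for an external noise signal.
module example_flank_sampler(
                             input wire  clk,
                             input wire  reset_n,
                             input wire  noise,
                             output wire bit_valid,
                             output wire bit_out
                            );
  reg         noise_sample0_reg;  // first synchronizer stage
  reg         noise_sample_reg;   // second synchronizer stage
  reg         flank_reg;          // previous synchronized sample
  reg [7 : 0] cycle_ctr_reg;      // free-running cycle counter
  reg         bit_reg;
  reg         bit_valid_reg;

  wire        rising_flank;

  // A rising flank is a 0 -> 1 transition of the synchronized signal.
  assign rising_flank = noise_sample_reg & ~flank_reg;
  assign bit_out      = bit_reg;
  assign bit_valid    = bit_valid_reg;

  always @ (posedge clk)
    begin : reg_update
      if (!reset_n)
        begin
          noise_sample0_reg <= 1'b0;
          noise_sample_reg  <= 1'b0;
          flank_reg         <= 1'b0;
          cycle_ctr_reg     <= 8'h00;
          bit_reg           <= 1'b0;
          bit_valid_reg     <= 1'b0;
        end
      else
        begin
          noise_sample0_reg <= noise;
          noise_sample_reg  <= noise_sample0_reg;
          flank_reg         <= noise_sample_reg;
          cycle_ctr_reg     <= cycle_ctr_reg + 1'b1;
          bit_valid_reg     <= rising_flank;

          // Assumption: the counter LSB at the flank is the raw bit.
          if (rising_flank)
            bit_reg <= cycle_ctr_reg[0];
        end
    end
endmodule
```

The two synchronizer stages are there because the noise signal is asynchronous to the system clock. Nothing in the sketch is technology specific, so the same logic would apply whether the noise source is external or integrated in an ASIC.
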
Cryptech currently uses Verilog simulators for functional verification and commercial FPGA tools for implementation, including timing analysis.
An ASIC implementation will require several new tools, including tools for synthesis, place & route, and static timing analysis that are accepted as sign-off tools by the chip process vendor.
The HW designed for the first iteration of Cryptech is not specifically
designed for FPGA implementation, but is in fact designed in a generic
way to allow for easy implementation using different technologies such
as ASICs.
There are, however, parts of the design that will have to be updated or
modified in order to create a good ASIC implementation. The Cryptech
project is confident that it knows what those parts are and what the
changes would entail.
Developing an ASIC will, however, require new tools, which will incur costs.