
Firmware implementation notes
Introduction
This text is specific to the firmware, the piece of software in TKey ROM. For a more general description of how to implement device apps, see the TKey Developer Handbook.
Definitions
- Firmware: Software in ROM responsible for loading, measuring, and starting applications. The firmware is included as part of the FPGA bitstream and not replaceable on a usual consumer TKey.
- Client: Software running on a computer or a mobile phone the TKey is inserted into.
- Device application or app: Software supplied by the client that runs on the TKey.
CPU modes and firmware
The TKey has two modes of software operation: firmware mode and application mode. The TKey always starts in firmware mode when it starts the firmware. When the application starts the hardware automatically switches to a more constrained environment: the application mode.
The TKey hardware cores are memory mapped but the memory access is different depending on mode. Firmware has complete access, except that the Unique Device Secret (UDS) words are readable only once even in firmware mode. The memory map is constrained when running in application mode, e.g. FW_RAM and UDS aren't readable, and several other hardware addresses are either not readable or not writable for the application.
When doing system calls from a device app the context switches back to firmware mode. However, the UDS is still not available, protected by two measures: 1) the UDS words can only be read out once and have already been read by firmware when measuring the app, and, 2) the UDS is protected by hardware after the execution leaves ROM for the first time.
See the table in the Developer Handbook for an overview of the memory access control.
Communication
The firmware communicates with the client using the UART_{RX,TX}_{STATUS,DATA} registers. On top of that it uses three protocols: the USB Mode Protocol, the TKey Framing Protocol, and the firmware's own protocol.
To communicate between the CPU and the CH552 USB controller it uses an internal protocol, used only within the TKey, which we call the USB Mode Protocol. It is used in both directions.
Name | Size | Comment |
---|---|---|
Endpoint | 1B | Origin or destination USB endpoint |
Length | 1B | Number of bytes following |
Payload | See above | Actual data from or to firmware |
The different endpoints:
Name | Value | Comment |
---|---|---|
DEBUG | 0x20 | A USB HID special debug pipe. Useful for debug prints. |
CH552 | 0x10 | USB controller control |
CDC | 0x40 | USB CDC-ACM, a serial port on the client. |
FIDO | 0x80 | A USB FIDO security token device, useful for FIDO-type applications. |
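The frame layout and endpoint values above can be sketched in C as follows. This is only a minimal illustration, not the firmware's actual code: the enum names and the helper `usb_mode_frame()` are hypothetical, and only the field order, field sizes and endpoint values come from the tables.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Endpoint values from the endpoint table. The names are hypothetical. */
enum usb_mode_endpoint {
    EP_CH552 = 0x10,
    EP_DEBUG = 0x20,
    EP_CDC = 0x40,
    EP_FIDO = 0x80,
};

/*
 * Hypothetical helper: write a USB Mode Protocol frame into out.
 * Returns the total frame length, or 0 if the payload does not fit
 * in the one-byte length field or in the output buffer.
 */
size_t usb_mode_frame(uint8_t *out, size_t outsize, uint8_t endpoint,
                      const uint8_t *payload, size_t len)
{
    if (len > UINT8_MAX || outsize < len + 2) {
        return 0;
    }

    out[0] = endpoint;     /* Origin or destination USB endpoint */
    out[1] = (uint8_t)len; /* Number of bytes following */
    if (len > 0) {
        memcpy(&out[2], payload, len);
    }

    return len + 2;
}
```

Since the length field is a single byte, a frame in this sketch carries at most 255 payload bytes.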
You can turn on and off different endpoints dynamically by sending commands to the CH552 control endpoint. When the TKey starts only the CH552 and the CDC endpoints are active. To change this, send a command to CH552 in this form:
Name | Size | Comment |
---|---|---|
Command | 1B | Command to the CH552 firmware |
Payload | 1B | Data for the command |
Commands:
Name | Value | Argument |
---|---|---|
Enable endpoints | 0x01 | Bitmask of endpoints |
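As a rough sketch, enabling the debug endpoint in addition to the two default ones could look like this. It rests on two assumptions that this text does not confirm: that the command travels as the payload of a USB Mode Protocol frame addressed to the CH552 endpoint, and that the bitmask is built by OR-ing the endpoint values. The names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Endpoint values from the endpoint table. The names are hypothetical. */
enum usb_mode_endpoint {
    EP_CH552 = 0x10,
    EP_DEBUG = 0x20,
    EP_CDC = 0x40,
    EP_FIDO = 0x80,
};

/* Command value from the command table. The name is hypothetical. */
#define CH552_CMD_ENABLE_ENDPOINTS 0x01

/*
 * Hypothetical helper: build an "Enable endpoints" command.
 * ASSUMPTION: the command is wrapped in a USB Mode Protocol frame
 * (endpoint, length, payload) sent to the CH552 endpoint.
 * Returns the frame length (4), or 0 if the buffer is too small.
 */
size_t ch552_enable_endpoints(uint8_t *out, size_t outsize, uint8_t mask)
{
    if (outsize < 4) {
        return 0;
    }

    out[0] = EP_CH552;                   /* Destination endpoint */
    out[1] = 2;                          /* Command + payload follow */
    out[2] = CH552_CMD_ENABLE_ENDPOINTS; /* Command */
    out[3] = mask;                       /* ASSUMPTION: OR of endpoint values */

    return 4;
}
```

A caller would pass, for instance, `EP_CH552 | EP_CDC | EP_DEBUG` as the mask.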
On top of the USB Mode Protocol is the TKey Framing Protocol which is described in the Developer Handbook.
The firmware uses a protocol on top of this framing layer which is used to bootstrap an application. All commands are initiated by the client. All commands receive a reply. See Firmware protocol in the Dev Handbook for specific details.
Memory constraints
Name | Size | FW mode | App mode |
---|---|---|---|
ROM | 8 kByte | r-x | r |
FW_RAM | 4 kByte* | rw- | - |
RAM | 128 kByte | rwx | rwx |
(*) FW_RAM is divided into the following areas:
- fw stack: 3824 bytes.
- resetinfo: 256 bytes.
- rest is available for .data and .bss.
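To see how little room is left for .data and .bss, the FW_RAM split can be written down as constants. This is a sketch only: the macro names are hypothetical, and it assumes that "4 kByte" means 4096 bytes.

```c
#include <stdint.h>

#define FW_RAM_SIZE 4096u    /* ASSUMPTION: 4 kByte == 4096 bytes */
#define FW_STACK_SIZE 3824u  /* fw stack */
#define RESETINFO_SIZE 256u  /* resetinfo */

/* Whatever remains is available for .data and .bss. */
#define FW_RAM_REST (FW_RAM_SIZE - FW_STACK_SIZE - RESETINFO_SIZE)

/* Fail the build if the fixed areas no longer fit in FW_RAM. */
_Static_assert(FW_STACK_SIZE + RESETINFO_SIZE <= FW_RAM_SIZE,
               "FW_RAM areas do not fit in FW_RAM");
```

Under that assumption, 16 bytes remain for .data and .bss.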
Firmware behaviour
The purpose of the firmware is to load, measure, and start an application received from the client over the USB/UART.
The firmware binary is part of the FPGA bitstream as the initial values of the Block RAMs used to construct the FW_ROM. The FW_ROM start address is located at 0x0000_0000 in the CPU memory map, which is also the CPU reset vector.
Firmware state machine
This is the state diagram of the firmware. There are only six states. State changes occur when we receive specific I/O or when a fatal error occurs.
stateDiagram-v2
S0: resetinfo
S4: loadflash
S1: waitcommand
S2: loading
S3: running
SE: failed
[*] --> S0
S0 --> S4: load 1 (def) or load 2
S0 --> S1
S1 --> S1: Commands
S1 --> S2: LOAD_APP
S1 --> SE: Error
S2 --> S2: LOAD_APP_DATA
S2 --> S3: Last block received
S2 --> SE: Error
S4 --> S3
S4 --> SE: Error
SE --> [*]
S3 --> [*]
States:
- resetinfo - We start by checking resetinfo data in FW_RAM.
- waitcommand - Waiting for initial commands from client. Allows the commands NAME_VERSION, GET_UDI, LOAD_APP.
- loadflash - Loading an app from flash.
- loading - Expect application data. Allows only the command LOAD_APP_DATA.
- running - Computes CDI and starts the application. Allows no commands.
- failed - Halts CPU. Allows no commands.
Allowed data in state resetinfo:
startfrom | next state |
---|---|
default | loadflash |
Start flash slot 1 | loadflash |
Start flash slot 2 | loadflash |
Start from client | waitcommand |
I/O in state loadflash:
I/O | next state |
---|---|
Last app data read | running |
Commands in state waitcommand:
command | next state |
---|---|
FW_CMD_NAME_VERSION | unchanged |
FW_CMD_GET_UDI | unchanged |
FW_CMD_LOAD_APP | loading |
Commands in state loading:
command | next state |
---|---|
FW_CMD_LOAD_APP_DATA | unchanged, or running on last chunk |
See Firmware protocol in the Dev Handbook for the definition of the specific commands and their responses.
Execution starts in state resetinfo where the firmware checks in FW_RAM for what to do next.
State changes to loadflash if the FW_RAM data indicates that it should start one of the two flash apps.
State changes to waitcommand if the FW_RAM data indicates that it instead should wait for commands from a client.
In loadflash, state changes to running if the app has been successfully loaded into RAM, or to failed otherwise.
State changes from waitcommand to loading when receiving LOAD_APP, which also sets the size of the app and thereby the number of data blocks to expect. After that we expect several LOAD_APP_DATA commands until the last block is received, when state is changed to running.
In running, the loaded device app is measured, the Compound Device Identifier (CDI) is computed, we do some cleanup of firmware data structures, enable the system calls, and finally start the app, which ends the firmware state machine. Hardware guarantees that we leave firmware mode automatically when the program counter leaves ROM.
The device app is now running in application mode. We can, however, return to firmware mode (excepting access to the UDS) by doing system calls. Note that ROM is still readable, but is now hardware protected from execution, except through the system call mechanism.
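The state tables above can be summarized as a single transition function. This is a minimal sketch for illustration and not how the firmware is actually structured: the enum names, the event set and `next_state()` are hypothetical. Anything not listed in the tables leads to failed.

```c
enum fw_state {
    FW_STATE_RESETINFO,
    FW_STATE_LOADFLASH,
    FW_STATE_WAITCOMMAND,
    FW_STATE_LOADING,
    FW_STATE_RUNNING,
    FW_STATE_FAILED,
};

/* Hypothetical events, one per row in the state tables. */
enum fw_event {
    EV_START_FLASH,   /* resetinfo: default, flash slot 1 or flash slot 2 */
    EV_START_CLIENT,  /* resetinfo: start from client */
    EV_FLASH_LOADED,  /* loadflash: last app data read */
    EV_NAME_VERSION,  /* FW_CMD_NAME_VERSION */
    EV_GET_UDI,       /* FW_CMD_GET_UDI */
    EV_LOAD_APP,      /* FW_CMD_LOAD_APP */
    EV_LOAD_APP_DATA, /* FW_CMD_LOAD_APP_DATA, more blocks expected */
    EV_LAST_APP_DATA, /* FW_CMD_LOAD_APP_DATA, last block */
    EV_ERROR,         /* any error */
};

/* Return the state that follows state s when event e occurs. */
enum fw_state next_state(enum fw_state s, enum fw_event e)
{
    switch (s) {
    case FW_STATE_RESETINFO:
        if (e == EV_START_FLASH) return FW_STATE_LOADFLASH;
        if (e == EV_START_CLIENT) return FW_STATE_WAITCOMMAND;
        break;
    case FW_STATE_LOADFLASH:
        if (e == EV_FLASH_LOADED) return FW_STATE_RUNNING;
        break;
    case FW_STATE_WAITCOMMAND:
        if (e == EV_NAME_VERSION || e == EV_GET_UDI) return FW_STATE_WAITCOMMAND;
        if (e == EV_LOAD_APP) return FW_STATE_LOADING;
        break;
    case FW_STATE_LOADING:
        if (e == EV_LOAD_APP_DATA) return FW_STATE_LOADING;
        if (e == EV_LAST_APP_DATA) return FW_STATE_RUNNING;
        break;
    case FW_STATE_RUNNING:
    case FW_STATE_FAILED:
        break; /* No commands allowed */
    }

    /* Errors and commands not allowed in the current state */
    return FW_STATE_FAILED;
}
```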
Golden path
Firmware loads the application at the start of RAM (0x4000_0000) from either flash or the UART. It uses a part of the special FW_RAM for its own stack.
When reset is released, the CPU starts executing the firmware. It begins in start.S by clearing all CPU registers, clearing all FW_RAM, and setting up a stack for itself there, and then jumps to main(). Also included in the assembly part of firmware is an interrupt handler for the system calls, but the handler is not yet enabled.
Beginning at main() it fills the entire RAM with pseudo random data and sets up the RAM address and data hardware scrambling with values from the True Random Number Generator (TRNG).
- Check the special resetinfo area in FW_RAM to see if there is any data about why a reset has been made. All zeroes(?) meaning default behaviour.
- If it was reset with intention to start a device app from client, see App loaded from client below.
- Default is to start the first device app from flash. If resetinfo says otherwise it starts the other one.
- Load flash app into RAM without USS.
- Compute digest of loaded app.
- Compare against stored app digest in partition table to note if app has been corrupted on flash. If corrupted, halt CPU.
- Proceed to Start the device app below.
If the app is the first set in a chain, it's the job of the app itself to reset the TKey when it has done its job. For instance, a verified boot loader app:
- includes a security policy, for instance a public key and code to check a signature.
- the app reads the message and the signature over the message (the digest of the next app in the chain) from the filesystem or from the client.
- if the signature provided over the message is verified to have been made by the corresponding private key, this app would do a reset(), passing the digest to the firmware for control and instructing it to start just that app.
- firmware would see the instructions about the reset in FW_RAM:
  - Where to expect the next app from: client, a slot in the filesystem?
  - The expected digest of the next app.
App loaded from client
Firmware waits for data coming in through the UART.
- The client sends the FW_CMD_LOAD_APP command with the size of the device app and the optional 32 byte hash of the user-supplied secret as arguments and gets a FW_RSP_LOAD_APP back. After using this it's not possible to restart the loading of an application.
- If the client receives a successful response, it will send multiple FW_CMD_LOAD_APP_DATA commands, together containing the full application.
- On receiving FW_CMD_LOAD_APP_DATA commands the firmware places the data into 0x4000_0000 and upwards. The firmware replies with a FW_RSP_LOAD_APP_DATA response to the client for each received block except the last data block.
- When the final block of the application image is received with a FW_CMD_LOAD_APP_DATA, the firmware measures the application by computing a BLAKE2s digest over the entire application. Then firmware sends back the FW_RSP_LOAD_APP_DATA_READY response containing the digest.
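The bookkeeping behind these steps can be sketched as below. This is an illustration under assumptions, not the firmware's loader: `struct app_loader`, `loader_begin()` and `loader_add()` are hypothetical names, the block size is left to the caller since it is defined by the firmware protocol rather than by this text, and the sketch assumes that any bytes in the last block beyond the announced app size are ignored.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical loader state kept between LOAD_APP_DATA commands. */
struct app_loader {
    uint8_t *dst; /* Next write position (start of RAM on the device) */
    size_t left;  /* Bytes of the app still expected */
};

/* On LOAD_APP: remember where to load and the announced app size. */
void loader_begin(struct app_loader *l, uint8_t *ram, size_t app_size)
{
    l->dst = ram;
    l->left = app_size;
}

/*
 * On LOAD_APP_DATA: copy the block upwards in RAM, but never more
 * than the number of bytes still expected.
 * Returns 1 if this was the last block, 0 if more blocks are expected.
 */
int loader_add(struct app_loader *l, const uint8_t *block, size_t block_len)
{
    size_t n = block_len < l->left ? block_len : l->left;

    memcpy(l->dst, block, n);
    l->dst += n;
    l->left -= n;

    return l->left == 0;
}
```

When `loader_add()` returns 1, the caller would measure the loaded image and send the final response with the digest instead of the ordinary per-block response.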
Start the device app
- If there is an app digest in the resetinfo left from the previous app, compare the digests. Halt the CPU if they differ.
- The Compound Device Identifier (CDI) is then computed by doing a new BLAKE2s using the Unique Device Secret (UDS), the application digest, and any User Supplied Secret (USS) digest already received.
- The start address of the device app, currently 0x4000_0000, is written to APP_ADDR and the size of the binary to APP_SIZE to let the device application know where it is loaded and how large it is, if it wants to relocate in RAM.
- The firmware now clears the part of the special FW_RAM where it keeps its stack.
- The interrupt handler for system calls is enabled.
- Firmware starts the application by jumping to the contents of APP_ADDR. Hardware automatically switches from firmware mode to application mode. In this mode some memory access is restricted, e.g. some addresses are inaccessible (UDS), and some are switched from read/write to read-only (see the memory map).
If during this whole time any commands are received which are not allowed in the current state, or any errors occur, we enter the "failed" state and execute an illegal instruction. An illegal instruction traps the CPU and hardware blinks the status LED red until a power cycle. No further instructions are executed.
User-supplied Secret (USS)
USS is a 32-byte secret provided by the user. Typically a client program gets a secret from the user and then applies a key derivation function of some sort, for instance a BLAKE2s, to get 32 bytes which it sends to the firmware to be part of the CDI computation.
Compound Device Identifier computation
The CDI is computed by:
CDI = blake2s(UDS, blake2s(app), USS)
In an ideal world, software would never be able to read UDS at all and we would have a BLAKE2s function in hardware that would be the only thing able to read the UDS. Unfortunately, we couldn't fit a BLAKE2s implementation in the FPGA at this time.
The firmware instead does the CDI computation using the special firmware-only FW_RAM which is invisible after switching to app mode. We keep the entire firmware stack in FW_RAM and clear the stack just before switching to app mode just in case.
We sleep for a random number of cycles before reading out the UDS, call blake2s_update() with it and then immediately call blake2s_update() again with the program digest, destroying the UDS stored in the internal context buffer. UDS should now not be in FW_RAM anymore. We can read UDS only once per power cycle so UDS should now not be available even to firmware.
Then we continue with the CDI computation by updating with an optional USS digest and finalizing the hash, storing the resulting digest in CDI.
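The order of the update calls matters, so here is a small sketch of it. It does not compute a real CDI: the update function is passed in as a pointer, and the `trace_update()` stand-in below only records its input so the order can be inspected. All names are hypothetical, and the sketch assumes that the UDS and the app digest are 32 bytes each, like the USS digest.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CDI_INPUT_SIZE 32 /* ASSUMPTION: UDS, app digest and USS digest are 32 bytes each */

/* Signature of a hash update function, e.g. a wrapper around blake2s_update(). */
typedef void (*hash_update_fn)(void *ctx, const uint8_t *in, size_t len);

/*
 * Feed the CDI inputs in the order described above:
 * UDS first, immediately followed by the app digest, then the
 * optional USS digest (pass NULL if the client supplied none).
 */
void cdi_feed(void *ctx, hash_update_fn update, const uint8_t *uds,
              const uint8_t *app_digest, const uint8_t *uss_digest)
{
    update(ctx, uds, CDI_INPUT_SIZE);
    update(ctx, app_digest, CDI_INPUT_SIZE);
    if (uss_digest != NULL) {
        update(ctx, uss_digest, CDI_INPUT_SIZE);
    }
}

/* Stand-in context that records its input. This is NOT a hash function. */
struct trace_ctx {
    uint8_t buf[3 * CDI_INPUT_SIZE];
    size_t len;
};

static void trace_update(void *ctx, const uint8_t *in, size_t len)
{
    struct trace_ctx *t = ctx;

    memcpy(&t->buf[t->len], in, len);
    t->len += len;
}
```

With a real BLAKE2s implementation, the caller would initialize the hash context before `cdi_feed()` and finalize it afterwards to get the CDI.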
Firmware system calls
The firmware provides a system call mechanism through the use of the PicoRV32 interrupt handler. System calls are triggered by writing to the trigger address: 0xe1000000. It's typically done with a function signature like this:
int syscall(uint32_t number, uint32_t arg1);
Arguments are system call number and up to 6 generic arguments passed to the system call handler. The caller should place the system call number in the a0 register and the arguments in registers a1 to a7 according to the RISC-V calling convention. The caller is responsible for saving and restoring registers.
The syscall handler returns execution on the next instruction after the store instruction to the trigger address. The return value from the syscall is now available in x10 (a0).
To add or change syscalls, see the syscall_handler() in syscall_handler.c.
Currently supported syscalls:
Name | Number | Argument | Description |
---|---|---|---|
RESET | 1 | Unused | Reset the TKey |
SET_LED | 10 | Colour | Set the colour of the status LED |
GET_VIDPID | 12 | Unused | Get Vendor and Product ID |
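A dispatcher for the syscalls in the table could be sketched like this. It is not the code in syscall_handler.c: the names, the `struct fake_hw` stand-in for the hardware registers, the way GET_VIDPID returns its value and the -1 error value for unknown numbers are all assumptions made so the example can run on a host.

```c
#include <stdint.h>

/* Syscall numbers from the table. The names are hypothetical. */
enum syscall_num {
    SYS_RESET = 1,
    SYS_SET_LED = 10,
    SYS_GET_VIDPID = 12,
};

/* Stand-in for the hardware the real handler would touch. */
struct fake_hw {
    uint32_t led;        /* Last colour written by SET_LED */
    uint32_t vidpid;     /* Vendor and Product ID */
    int reset_requested; /* Set to 1 by RESET */
};

/*
 * Dispatch on the syscall number and return the value that would
 * end up in a0 after the syscall.
 * ASSUMPTION: unknown syscall numbers return -1.
 */
int handle_syscall(struct fake_hw *hw, uint32_t number, uint32_t arg1)
{
    switch (number) {
    case SYS_RESET:
        hw->reset_requested = 1;
        return 0;
    case SYS_SET_LED:
        hw->led = arg1; /* arg1 is the colour */
        return 0;
    case SYS_GET_VIDPID:
        /* ASSUMPTION: the ID is returned directly as the return value */
        return (int)hw->vidpid;
    default:
        return -1;
    }
}
```

Adding a syscall in this sketch means adding a number to the enum and a case to the switch.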
Developing firmware
Standing in hw/application_fpga/ you can run make firmware.elf to build just the firmware. You don't need all the FPGA development tools. See the Developer Handbook for the tools you need. The easiest is probably to use our OCI image, ghcr.io/tillitis/tkey-builder.
Our version of qemu is also useful for debugging the firmware. You can attach GDB, use breakpoints, et cetera.
There is a special make target for QEMU: qemu_firmware.elf, which sets -DQEMU_DEBUG, so you can do debug prints using the debug_*() functions. Note that these functions are only usable in QEMU and that you might need to make clean before building, if you have already built before.
If you want debug prints to show up on the special TKey HID debug endpoint instead, define -DTKEY_DEBUG.
Note that if you use TKEY_DEBUG you must have something listening on the corresponding HID device. It's usually the last HID device created. On Linux, for instance, this means the last reported hidraw in dmesg is the one you should do cat /dev/hidrawX on.
tkey-libs
Most of the utility functions that the firmware uses live in tkey-libs. The canonical place where you can find tkey-libs is at:
https://github.com/tillitis/tkey-libs
but we have vendored it in for firmware use in ../tkey-libs. See top README for how to update.
Test firmware
The test firmware is in testfw. It's currently a bit of a hack and just runs through expected behavior of the hardware cores, giving special focus to access control in firmware mode and application mode.
It outputs results on the UART. This means that you have to attach a terminal program to the serial port device, even if it's running in qemu. It waits for you to type a character before starting the tests.
It needs to be compiled with -Os instead of -O2 in CFLAGS in the ordinary application_fpga/Makefile to be able to fit in the 6 kByte ROM.