# Firmware implementation notes

## Introduction

This text is specific to the firmware, the piece of software in TKey
ROM. For a more general description of how to implement device apps,
see [the TKey Developer Handbook](https://dev.tillitis.se/).

## Definitions

- Firmware: Software in ROM responsible for loading, measuring, and
  starting applications. The firmware is included as part of the FPGA
  bitstream and not replaceable on a usual consumer TKey.
- Client: Software running on a computer or a mobile phone the TKey is
  inserted into.
- Device application or app: Software supplied by the client that runs
  on the TKey.

## CPU modes and firmware

The TKey has two modes of software operation: firmware mode and
application mode. The TKey always starts in firmware mode when it
starts the firmware. When the application starts the hardware
automatically switches to a more constrained environment: the
application mode.

The TKey hardware cores are memory mapped but the memory access is
different depending on mode. Firmware has complete access, except that
the Unique Device Secret (UDS) words are readable only once even in
firmware mode. The memory map is constrained when running in
application mode, e.g. FW\_RAM and UDS aren't readable, and several
other hardware addresses are either not readable or not writable for
the application.

When doing system calls from a device app the context switches back to
firmware mode. However, the UDS is still not available, protected by
two measures: 1) the UDS words can only be read out once and have
already been read by firmware when measuring the app, and, 2) the UDS
is protected by hardware after the execution leaves ROM for the first
time.

See the table in [the Developer
Handbook](https://dev.tillitis.se/memory/) for an overview of the
memory access control.

## Communication

The firmware communicates with the client using the
`UART_{RX,TX}_{STATUS,DATA}` registers. On top of that it uses three
protocols: the USB Mode Protocol, the TKey Framing Protocol, and the
firmware's own protocol.

To communicate between the CPU and the CH552 USB controller it uses an
internal protocol, used only within the TKey, which we call the USB
Mode Protocol. It is used in both directions.

| *Name*   | *Size*    | *Comment*                          |
|----------|-----------|------------------------------------|
| Endpoint | 1B        | Origin or destination USB endpoint |
| Length   | 1B        | Number of bytes following          |
| Payload  | See above | Actual data from or to firmware    |

The different endpoints:

| *Name* | *Value* | *Comment*                                                           |
|--------|---------|---------------------------------------------------------------------|
| CTRL   | 0x20    | A USB HID special debug pipe. Useful for debug prints.              |
| CDC    | 0x40    | USB CDC-ACM, a serial port on the client.                           |
| HID    | 0x80    | A USB HID security token device, useful for FIDO-type applications. |

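
As a rough illustration of the header layout above, here is a minimal
C sketch that builds one USB Mode Protocol frame. The `usb_endpoint`
enum and the `usb_mode_frame()` helper are hypothetical names made up
for this example, not firmware or tkey-libs APIs. Only the endpoint
values and the endpoint, length, payload order come from the tables.
The only size limit assumed is the one implied by the 1-byte length
field.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Endpoint values from the table above. */
enum usb_endpoint {
	EP_CTRL = 0x20, /* USB HID debug pipe */
	EP_CDC = 0x40,  /* USB CDC-ACM serial port */
	EP_HID = 0x80,  /* USB HID security token device */
};

/*
 * Hypothetical helper, not part of the firmware: writes one frame
 * (endpoint, length, payload) into out.
 *
 * Returns the total frame size in bytes, or 0 if the payload does not
 * fit in the 1-byte length field or in the output buffer.
 */
size_t usb_mode_frame(uint8_t *out, size_t out_size, enum usb_endpoint ep,
		      const uint8_t *payload, size_t len)
{
	if (len > UINT8_MAX || out_size < len + 2) {
		return 0;
	}

	out[0] = (uint8_t)ep;  /* Endpoint: 1 byte */
	out[1] = (uint8_t)len; /* Length: number of bytes following */
	if (len > 0) {
		memcpy(&out[2], payload, len); /* Payload */
	}

	return len + 2;
}
```

For example, framing the three bytes `01 02 03` for the CDC endpoint
gives the five bytes `40 03 01 02 03`.
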
On top of the USB Mode Protocol is [the TKey Framing
Protocol](https://dev.tillitis.se/protocol/) which is described in the
Developer Handbook.

The firmware uses a protocol on top of this framing layer which is
used to bootstrap an application. All commands are initiated by the
client. All commands receive a reply. See [Firmware
protocol](http://dev.tillitis.se/protocol/#firmware-protocol) in the
Dev Handbook for specific details.

## Memory constraints

| *Name*  | *Size*    | *FW mode* | *App mode* |
|---------|-----------|-----------|------------|
| ROM     | 8 kByte   | r-x       | r          |
| FW\_RAM | 4 kByte\* | rw-       | -          |
| RAM     | 128 kByte | rwx       | rwx        |

\* FW\_RAM is divided into the following areas:

- fw stack: 3824 bytes.
- resetinfo: 256 bytes.
- rest is available for .data and .bss.

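
The split of FW\_RAM can be checked with a little arithmetic. The
sketch below assumes that 4 kByte means 4096 bytes. The macro and
function names are made up for this example and are not taken from
the firmware sources.

```c
#include <stdint.h>

/* Sizes from the table and list above; 4 kByte is assumed to be 4096 bytes. */
#define FW_RAM_SIZE 4096u
#define FW_STACK_SIZE 3824u
#define RESETINFO_SIZE 256u

/* What remains for .data and .bss once stack and resetinfo are placed. */
#define FW_DATA_BSS_SIZE (FW_RAM_SIZE - FW_STACK_SIZE - RESETINFO_SIZE)

/* Fail the build if the fixed areas ever outgrow FW_RAM. */
_Static_assert(FW_STACK_SIZE + RESETINFO_SIZE <= FW_RAM_SIZE,
	       "FW_RAM areas must fit in FW_RAM");

/* Hypothetical accessor so the value can be inspected at run time. */
uint32_t fw_ram_data_bss_size(void)
{
	return FW_DATA_BSS_SIZE;
}
```

Under that assumption, 4096 - 3824 - 256 leaves 16 bytes for .data and
.bss.
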
## Firmware behaviour

The purpose of the firmware is to load, measure, and start an
application, either stored in flash or received from the client over
the USB/UART.

The firmware binary is part of the FPGA bitstream as the initial
values of the Block RAMs used to construct the `FW_ROM`. The `FW_ROM`
start address is located at `0x0000_0000` in the CPU memory map, which
is also the CPU reset vector.

### Reset intentions

We have a number of reset options we call `startfrom`:

1. Start from flash slot 1 (default): `FLASH1`.
2. Start from flash slot 2: `FLASH2`.
3. Load and start an app from flash slot 1 with a specific app hash:
   `FLASH1_VER`.
4. Load and start an app from flash slot 2 with a specific app hash:
   `FLASH2_VER`.
5. Load and start a new app from client: `CLIENT`.
6. Load and start an app from client with a specific app hash:
   `CLIENT_VER`.

### Firmware state machine

This is the state diagram of the firmware. State changes occur when
we receive specific I/O or a fatal error occurs.

```mermaid
stateDiagram-v2
    S0: initial
    S1: waitcommand
    S2: loading
    S3: flash_loading
    S4: auth_app
    S5: starting
    S6: compute_cdi
    SE: failed

    [*] --> S0

    S0 --> S1
    S0 --> S4: Default

    S1 --> S1: Commands
    S1 --> S2: LOAD_APP
    S1 --> SE: Error

    S2 --> S2: LOAD_APP_DATA
    S2 --> S6: Last block received
    S2 --> SE: Error

    S6 --> S3

    S3 --> S5

    S4 --> S5
    S4 --> SE: Error

    SE --> [*]
    S5 --> [*]
```

States:

- `initial`: We start by checking resetinfo data in `FW_RAM` for
  `startfrom`.
- `waitcommand`: Waiting for initial commands from client. Allows the
  commands `NAME_VERSION`, `GET_UDI`, `LOAD_APP`.
- `loading`: Expecting application data from client. Allows only the
  command `LOAD_APP_DATA`.
- `flash_loading`: Loading and authenticating an app from flash.
  Computes CDI, creates or checks the authentication of the flash app.
  Allows no commands.
- `starting`: Starts the application. Does not return to firmware.
  Allows no commands.
- `failed`: Halts CPU. Allows no commands.

Allowed data in `resetinfo` in state `initial`:

| *startfrom*  | *next state*    |
|--------------|-----------------|
| `FLASH1`     | `flash_loading` |
| `FLASH2`     | `flash_loading` |
| `FLASH1_VER` | `flash_loading` |
| `FLASH2_VER` | `flash_loading` |
| `CLIENT`     | `waitcommand`   |
| `CLIENT_VER` | `waitcommand`   |

I/O in state `flash_loading`:

| *I/O*              | *next state* |
|--------------------|--------------|
| Last app data read | `starting`   |

Commands in state `waitcommand`:

| *command*             | *next state* |
|-----------------------|--------------|
| `FW_CMD_NAME_VERSION` | unchanged    |
| `FW_CMD_GET_UDI`      | unchanged    |
| `FW_CMD_LOAD_APP`     | `loading`    |

Commands in state `loading`:

| *command*              | *next state*                          |
|------------------------|---------------------------------------|
| `FW_CMD_LOAD_APP_DATA` | unchanged or `starting` on last chunk |

No other states allow commands.

See [Firmware protocol in the Dev
Handbook](http://dev.tillitis.se/protocol/#firmware-protocol) for the
definition of the specific commands and their responses.

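
The transition tables above can be summarised as two small functions.
This is only a sketch of the tables, not the firmware's actual code:
all type, enumerator, and function names are invented here, and the
numeric values of the enumerators are arbitrary. It follows the
`loading` table, which goes to `starting` on the last chunk, whereas
the diagram passes through `compute_cdi` first. A command that is not
allowed in the current state is treated as an error and leads to
`failed`.

```c
#include <stdbool.h>

/* States from the list above; values are arbitrary. */
enum fw_state {
	FW_STATE_INITIAL,
	FW_STATE_WAITCOMMAND,
	FW_STATE_LOADING,
	FW_STATE_FLASH_LOADING,
	FW_STATE_STARTING,
	FW_STATE_FAILED,
};

/* Reset intentions; names from the document, values hypothetical. */
enum startfrom {
	START_FLASH1,
	START_FLASH2,
	START_FLASH1_VER,
	START_FLASH2_VER,
	START_CLIENT,
	START_CLIENT_VER,
};

/* Firmware commands; names from the tables, values hypothetical. */
enum fw_cmd {
	CMD_NAME_VERSION,
	CMD_GET_UDI,
	CMD_LOAD_APP,
	CMD_LOAD_APP_DATA,
};

/* Next state after `initial`, given the startfrom value in resetinfo. */
enum fw_state state_after_reset(enum startfrom from)
{
	switch (from) {
	case START_FLASH1:
	case START_FLASH2:
	case START_FLASH1_VER:
	case START_FLASH2_VER:
		return FW_STATE_FLASH_LOADING;
	case START_CLIENT:
	case START_CLIENT_VER:
		return FW_STATE_WAITCOMMAND;
	}
	return FW_STATE_FAILED; /* unknown startfrom value */
}

/* Next state after receiving cmd in state; last_chunk only matters in loading. */
enum fw_state state_after_command(enum fw_state state, enum fw_cmd cmd,
				  bool last_chunk)
{
	switch (state) {
	case FW_STATE_WAITCOMMAND:
		if (cmd == CMD_NAME_VERSION || cmd == CMD_GET_UDI) {
			return FW_STATE_WAITCOMMAND; /* unchanged */
		}
		if (cmd == CMD_LOAD_APP) {
			return FW_STATE_LOADING;
		}
		break;
	case FW_STATE_LOADING:
		if (cmd == CMD_LOAD_APP_DATA) {
			return last_chunk ? FW_STATE_STARTING : FW_STATE_LOADING;
		}
		break;
	default:
		break; /* no other state allows commands */
	}
	return FW_STATE_FAILED;
}
```
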
Plain text explanation of the states:

- `initial`: Execution starts here. The firmware checks in the
  `FW_RAM` for `startfrom` for what to do next.

  For all `startfrom` values `FLASH*` the next state is
  `flash_loading`. Otherwise it goes to `waitcommand`, indicating that
  it should wait for further commands from the client.

- `flash_loading` loads and measures an app from flash, the Compound
  Device Identifier (CDI) is computed, then the app is authenticated
  against a stored digest to see that no one has changed the app by
  manipulating the flash. The computation is done by:

  `digest = blake2s(cdi, nonce from flash)`

  and then compared against the stored digest in the app's flash slot.

- `waitcommand` waits for commands from the client. State changes to
  `loading` when receiving `LOAD_APP`, which also gives the size of
  the app, and thus how many data blocks to expect. After that we
  expect several `LOAD_APP_DATA` commands until the last block is
  received, when state is changed to `running`.

- `compute_cdi`: The Compound Device Identifier (CDI) is computed
  and we go to `starting`.

- `starting`: Clean up firmware data structures, enable the system
  calls, and start the app, which ends the firmware state machine.
  Hardware guarantees that we leave firmware mode automatically when
  the program counter leaves ROM.

After `starting` the device app is now running in application mode. We
can, however, return to firmware mode (excepting access to the UDS) by
doing system calls. Note that ROM is still readable, but is now
hardware protected from execution, except through the system call
mechanism.

### Golden path

Firmware loads the application at the start of RAM (`0x4000_0000`)
from either flash or the UART. It uses a part of the special FW\_RAM
for its own stack.

When reset is released, the CPU starts executing the firmware. It
begins in `start.S` by clearing all CPU registers, clearing all
FW\_RAM, setting up a stack for itself there, and then jumping to
`main()`. Also included in the assembly part of firmware is an
interrupt handler for the system calls, but the handler is not yet
enabled.

Beginning at `main()` it fills the entire RAM with pseudorandom data
and sets up the RAM address and data hardware scrambling with values
from the True Random Number Generator (TRNG).

1. Check the special resetinfo area in FW\_RAM to see if there is any
   data about why a reset has been made. All zeroes(?) means default
   behaviour.

2. If it was reset with intention to start a device app from client,
   see [App loaded from client](#app-loaded-from-client) below.

3. Default is to start the first device app from flash. If resetinfo
   says otherwise it starts the other one.

4. Load flash app into RAM without USS.

5. Compute digest of loaded app.

6. Compare against stored app digest in partition table to note if app
   has been corrupted on flash. If corrupted, halt CPU.

7. Proceed to [Start the device app](#start-the-device-app) below.

If the app is the first in a chain, it's the job of the app itself
to reset the TKey when it has done its job. For instance, a verified
boot loader app:

- includes a security policy, for instance a public key and code to
  check a signature.

- the app reads the message and the signature over the message (the
  digest of the next app in the chain) from the filesystem or from
  the client.

- if the signature provided over the message is verified to have been
  made by the corresponding private key, this app would do a
  `reset()`, passing the digest to the firmware for control and
  instructing it to start *just* that app.

- firmware would see the instructions about the reset in FW\_RAM:

  1. Where to expect the next app from: client, a slot in the
     filesystem?
  2. The expected digest of the next app.

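
To make the hand-over between apps in a chain more concrete, here is a
hedged sketch of what the reset instructions could look like and how
the digest check could be done. The struct layout, the `has_digest`
flag, and all names are assumptions for illustration. The real
resetinfo format is not specified in this document. The digest size of
32 bytes matches the BLAKE2s output length.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DIGEST_SIZE 32 /* BLAKE2s digest length in bytes */

/* Hypothetical view of the reset instructions left in FW_RAM. */
struct resetinfo_sketch {
	uint32_t startfrom;              /* where to expect the next app from */
	bool has_digest;                 /* assumed flag: is a digest present? */
	uint8_t app_digest[DIGEST_SIZE]; /* expected digest of the next app */
};

/*
 * Returns true if the measured app may be started: either no digest
 * was requested, or the measured digest equals the expected one.
 * In the firmware a mismatch would mean halting the CPU.
 */
bool digest_allows_start(const struct resetinfo_sketch *info,
			 const uint8_t measured[DIGEST_SIZE])
{
	if (!info->has_digest) {
		return true;
	}
	return memcmp(info->app_digest, measured, DIGEST_SIZE) == 0;
}
```
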
#### App loaded from client

Firmware waits for data coming in through the UART.

1. The client sends the `FW_CMD_LOAD_APP` command with the size of
   the device app and the optional 32 byte hash of the user-supplied
   secret as arguments and gets a `FW_RSP_LOAD_APP` back. After
   using this it's not possible to restart the loading of an
   application.

2. If the client receives a successful response, it will send
   multiple `FW_CMD_LOAD_APP_DATA` commands, together containing the
   full application.

3. On receiving `FW_CMD_LOAD_APP_DATA` commands the firmware places
   the data into `0x4000_0000` and upwards. The firmware replies
   with a `FW_RSP_LOAD_APP_DATA` response to the client for each
   received block except the last data block.

4. When the final block of the application image is received with a
   `FW_CMD_LOAD_APP_DATA`, the firmware measures the application by
   computing a BLAKE2s digest over the entire application. Then
   firmware sends back the `FW_RSP_LOAD_APP_DATA_READY` response
   containing the digest.

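
From the client's point of view, the steps above come down to
splitting the app into chunks and expecting a different reply for the
last one. The sketch below illustrates that bookkeeping only. The
function and enumerator names are hypothetical, and `chunk_size` is a
free parameter here because this document does not state how many
bytes each `FW_CMD_LOAD_APP_DATA` carries.

```c
#include <stddef.h>

/* Replies named in the steps above; the enumerator values are arbitrary. */
enum fw_rsp {
	RSP_LOAD_APP_DATA,       /* stands for FW_RSP_LOAD_APP_DATA */
	RSP_LOAD_APP_DATA_READY, /* stands for FW_RSP_LOAD_APP_DATA_READY */
};

/*
 * Number of FW_CMD_LOAD_APP_DATA commands needed to send app_size
 * bytes in chunks of at most chunk_size bytes. Returns 0 if either
 * argument is 0.
 */
size_t load_app_chunks(size_t app_size, size_t chunk_size)
{
	if (app_size == 0 || chunk_size == 0) {
		return 0;
	}
	/* Round up without risking overflow in app_size + chunk_size. */
	return app_size / chunk_size + (app_size % chunk_size != 0 ? 1 : 0);
}

/* Reply the client should expect for chunk number index (0-based). */
enum fw_rsp expected_reply(size_t index, size_t total)
{
	return (index + 1 == total) ? RSP_LOAD_APP_DATA_READY
				    : RSP_LOAD_APP_DATA;
}
```

With a 1001-byte app and a made-up chunk size of 100 bytes, the client
would send 11 commands and expect `FW_RSP_LOAD_APP_DATA_READY` only
for the last one.
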
#### Start the device app

1. If there is an app digest in the resetinfo left from previous app,
   compare the digests. Halt the CPU if they differ.

2. The Compound Device Identifier
   ([CDI](#compound-device-identifier-computation)) is then computed
   by doing a new BLAKE2s using the Unique Device Secret (UDS), the
   application digest, and any User Supplied Secret (USS) digest
   already received.

3. The start address of the device app, currently `0x4000_0000`, is
   written to `APP_ADDR` and the size of the binary to `APP_SIZE` to
   let the device application know where it is loaded and how large it
   is, if it wants to relocate in RAM.

4. The firmware now clears the part of the special `FW_RAM` where it
   keeps its stack.

5. The interrupt handler for system calls is enabled.

6. Firmware starts the application by jumping to the contents of
   `APP_ADDR`. Hardware automatically switches from firmware mode to
   application mode. In this mode some memory access is restricted,
   e.g. some addresses are inaccessible (`UDS`), and some are switched
   from read/write to read-only (see [the memory
   map](https://dev.tillitis.se/memory/)).

If during this whole time any commands are received which are not
allowed in the current state, or any errors occur, we enter the
`failed` state and execute an illegal instruction. An illegal
instruction traps the CPU and hardware blinks the status LED red until
a power cycle. No further instructions are executed.

### User-supplied Secret (USS)

USS is a 32-byte secret provided by the user. Typically a client
program gets a secret from the user and then does a key derivation
function of some sort, for instance a BLAKE2s, to get 32 bytes which
it sends to the firmware to be part of the CDI computation.

### Compound Device Identifier computation

The CDI is computed by:

```
CDI = blake2s(UDS, blake2s(app), USS)
```

In an ideal world, software would never be able to read UDS at all and
we would have a BLAKE2s function in hardware that would be the only
thing able to read the UDS. Unfortunately, we couldn't fit a BLAKE2s
implementation in the FPGA at this time.

The firmware instead does the CDI computation using the special
firmware-only `FW_RAM` which is invisible after switching to app mode.
We keep the entire firmware stack in `FW_RAM` and clear the stack just
before switching to app mode just in case.

We sleep for a random number of cycles before reading out the UDS,
then call `blake2s_update()` with it and immediately call
`blake2s_update()` again with the program digest, destroying the UDS
stored in the internal context buffer. UDS should now not be in
`FW_RAM` anymore. We can read UDS only once per power cycle so UDS
should now not be available even to firmware.

Then we continue with the CDI computation by updating with an optional
USS digest and finalizing the hash, storing the resulting digest in
`CDI`.

### Firmware system calls

The firmware provides a system call mechanism through the use of the
PicoRV32 interrupt handler. System calls are triggered by writing to
the trigger address: `0xe1000000`. It's typically done with a function
signature like this:

```
int syscall(uint32_t number, uint32_t arg1);
```

Arguments are the system call number and up to 6 generic arguments
passed to the system call handler. The caller should place the system
call number in the a0 register and the arguments in registers a1 to a7
according to the RISC-V calling convention. The caller is responsible
for saving and restoring registers.

The syscall handler returns execution on the next instruction after
the store instruction to the trigger address. The return value from
the syscall is now available in x10 (a0).

To add or change syscalls, see the `syscall_handler()` in
`syscall_handler.c`.

Currently supported syscalls:

| *Name*      | *Number* | *Argument* | *Description*                    |
|-------------|----------|------------|----------------------------------|
| RESET       | 1        | Unused     | Reset the TKey                   |
| SET\_LED    | 10       | Colour     | Set the colour of the status LED |
| GET\_VIDPID | 12       | Unused     | Get Vendor and Product ID        |

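
To illustrate how a handler might dispatch on the numbers in the table
above, here is a host-runnable sketch. It is not the real
`syscall_handler()`, which may look quite different. The
`dispatch_syscall()` function, the variables standing in for hardware
state, the made-up VID/PID value, and the return convention (0 on
success, -1 for an unknown number) are all assumptions for this
example.

```c
#include <stdint.h>

/* Syscall numbers from the table above. */
enum syscall_num {
	SYS_RESET = 1,
	SYS_SET_LED = 10,
	SYS_GET_VIDPID = 12,
};

/* Hypothetical stand-ins for hardware state, so the sketch runs on a host. */
static uint32_t led_colour;
static int reset_requested;
static const uint32_t vidpid = 0x12345678; /* made-up value */

/*
 * Hypothetical dispatcher. Returns the syscall's result, or -1 if the
 * number is not one of the supported syscalls.
 */
int32_t dispatch_syscall(uint32_t number, uint32_t arg1)
{
	switch (number) {
	case SYS_RESET:
		reset_requested = 1; /* argument unused */
		return 0;
	case SYS_SET_LED:
		led_colour = arg1; /* argument is the colour */
		return 0;
	case SYS_GET_VIDPID:
		return (int32_t)vidpid; /* argument unused */
	default:
		return -1;
	}
}
```
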
## Developing firmware

Standing in `hw/application_fpga/` you can run `make firmware.elf` to
build just the firmware. You don't need all the FPGA development
tools. See [the Developer Handbook](https://dev.tillitis.se/tools/)
for the tools you need. The easiest is probably to use our OCI image,
`ghcr.io/tillitis/tkey-builder`.

[Our version of qemu](https://dev.tillitis.se/tools/#qemu-emulator) is
also useful for debugging the firmware. You can attach GDB, use
breakpoints, et cetera.

There is a special make target for QEMU: `qemu_firmware.elf`, which
sets `-DQEMU_DEBUG`, so you can use debug prints through the
`debug_*()` functions. Note that these functions are only usable in
QEMU and that you might need to `make clean` before building, if you
have already built before.

If you want debug prints to show up on the special TKey HID debug
endpoint instead, define `-DTKEY_DEBUG`.

Note that if you use `TKEY_DEBUG` you *must* have something listening
on the corresponding HID device. It's usually the last HID device
created. On Linux, for instance, this means the last reported hidraw
in `dmesg` is the one you should do `cat /dev/hidrawX` on.

### tkey-libs

Most of the utility functions that the firmware uses live in
`tkey-libs`. The canonical place where you can find tkey-libs is at:

https://github.com/tillitis/tkey-libs

but we have vendored it in for firmware use in `../tkey-libs`. See top
README for how to update.

### Test firmware

The test firmware is in `testfw`. It's currently a bit of a hack and
just runs through expected behaviour of the hardware cores, giving
special focus to access control in firmware mode and application mode.

It outputs results on the UART. This means that you have to attach a
terminal program to the serial port device, even if it's running in
qemu. It waits for you to type a character before starting the tests.

It needs to be compiled with `-Os` instead of `-O2` in `CFLAGS` in the
ordinary `application_fpga/Makefile` to be able to fit in the 6 kByte
ROM.