mirror of
https://github.com/tillitis/tillitis-key1.git
synced 2024-10-01 01:45:38 -04:00
doc: Update and expand firmware README
- Remove all text about other software than firmware.
- Remove the Reset section.
- Include a diagram and detailed explanation about the state machine in close vicinity.
- Describe the test firmware.

Co-authored-by: Joachim Strömbergson <joachim@assured.se>
parent cc16c8481c
commit f1534e5dad
@ -1,201 +1,103 @@
|
||||
# Tillitis TKey software
|
||||
|
||||
**NOTE:** Documentation has migrated to dev.tillitis.se; this is kept for
|
||||
history. This is likely to be outdated.
|
||||
# Firmware
|
||||
|
||||
## Introduction
|
||||
|
||||
This text is both an introduction to and a requirement specification
|
||||
of the TKey firmware, its protocol, and an overview of how TKey
|
||||
applications are supposed to work. For an overview of the TKey
|
||||
concepts, see [System Description](system_description.md).
|
||||
This text is an introduction to, a requirement specification of,
|
||||
and some implementation notes of the TKey firmware. It also gives a
|
||||
few hints on developing and debugging the firmware.
|
||||
|
||||
First, some definitions:
|
||||
This text is specific for the firmware. For a more general description
|
||||
on how to implement device apps, see [the TKey Developer
|
||||
Handbook](https://dev.tillitis.se/).
|
||||
|
||||
## Definitions
|
||||
|
||||
- Firmware - software in ROM responsible for loading applications. The
|
||||
firmware is included as part of the FPGA bit stream.
|
||||
- Application or app - software supplied by the host machine which is
|
||||
firmware is included as part of the FPGA bitstream and not
|
||||
replaceable on a usual consumer TKey.
|
||||
- Device application or app - software supplied by the client which is
|
||||
received, loaded, measured, and started by the firmware.
|
||||
|
||||
## CPU modes and firmware
|
||||
|
||||
The TKey has two modes of software operation: firmware mode and
|
||||
application mode. The firmware mode has the responsibility of
|
||||
receiving, measuring, loading, and starting the application. When the
|
||||
firmware is about to start the application it switches to a more
|
||||
constrained environment, the application mode.
|
||||
application mode. The TKey always starts in firmware mode and starts
|
||||
the firmware. When the firmware is about to start the application it
|
||||
switches to a more constrained environment, the application mode.
|
||||
|
||||
The firmware and application use memory-mapped input/output (MMIO)
|
||||
for communication with the hardware. The memory map is constrained
|
||||
when running in application mode, e.g. FW-RAM and UDS aren't readable,
|
||||
and several MMIO addresses are either not readable or not writable for
|
||||
the application.
|
||||
The TKey hardware cores are memory mapped. Firmware has complete
|
||||
access, except that the UDS is readable only once. The memory map is
|
||||
constrained when running in application mode, e.g. FW\_RAM and UDS
|
||||
aren't readable, and several other hardware addresses are either not
|
||||
readable or not writable for the application.
|
||||
|
||||
See table in the [System
|
||||
Description](system_description.md#memory-mapped-hardware-functions)
|
||||
for details about access control rules in the memory system and MMIO.
|
||||
See the table in [the Developer
|
||||
Handbook](https://dev.tillitis.se/memory/) for an overview about the
|
||||
memory access control.
|
||||
|
||||
The firmware (and optionally all software) on the TKey can communicate
|
||||
with the host via the `UART_{RX,TX}_{STATUS,DATA}` registers, using the
|
||||
framing protocol described in the [Framing
|
||||
## Communication
|
||||
|
||||
The firmware communicates with the host via the
|
||||
`UART_{RX,TX}_{STATUS,DATA}` registers, using the framing protocol
|
||||
described in the [Framing
|
||||
Protocol](https://dev.tillitis.se/protocol/).
|
||||
|
||||
The firmware defines a protocol on top of this framing layer which is
|
||||
The firmware uses a protocol on top of this framing layer which is
|
||||
used to bootstrap the application. All commands are initiated by the
|
||||
host. All commands receive a reply. See [Firmware
|
||||
client. All commands receive a reply. See [Firmware
|
||||
protocol](#firmware-protocol) for specific details.
|
||||
|
||||
Applications define their own protocol used for communication with
|
||||
their host part. They may or may not be based on the Framing Protocol.
|
||||
|
||||
## CPU
|
||||
|
||||
The CPU is a PicoRV32, a 32-bit RISC-V processor (arch: RV32IC\_Zmmul)
|
||||
which runs the firmware and the application. The firmware and
|
||||
application both run in RISC-V machine mode. All types are
|
||||
little-endian.
|
||||
|
||||
## Constraints
|
||||
## Memory constraints
|
||||
|
||||
- ROM: 6 kByte.
|
||||
- FW\_RAM: 2 kByte.
|
||||
- RAM: 128 kByte.
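
The sizes above can be turned into a small set of constants. The following is only a sketch for illustration, not code from the firmware: the macro and function names are made up, the ROM and RAM base addresses are the `0x0000_0000` and `0x4000_0000` mentioned elsewhere in this text, and the FW\_RAM base is an assumption derived from the stack top `0xd000_0800` minus 2 kByte. Since the app is loaded at the start of RAM, whatever the image does not occupy is what remains for the app's stack and other data.

```c
#include <stdint.h>

/* Base addresses. ROM and RAM bases are stated in this document. */
#define ROM_BASE    0x00000000u
#define RAM_BASE    0x40000000u
/* Assumption: derived from a stack top of 0xd000_0800 minus 2 kByte. */
#define FW_RAM_BASE 0xd0000000u

/* Sizes from the memory constraints list. */
#define ROM_SIZE    (6u * 1024u)
#define FW_RAM_SIZE (2u * 1024u)
#define RAM_SIZE    (128u * 1024u)

/* Hypothetical helper: bytes of RAM not occupied by an app image of
 * app_size bytes loaded at RAM_BASE. Returns 0 if the image does not
 * fit in RAM at all. */
static uint32_t ram_left_after_app(uint32_t app_size)
{
    if (app_size > RAM_SIZE) {
        return 0;
    }
    return RAM_SIZE - app_size;
}
```

For example, a 4 kByte image would leave `ram_left_after_app(4096)` bytes, that is 124 kByte, for everything else the app keeps in RAM.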
|
||||
|
||||
## Firmware
|
||||
## Firmware behaviour
|
||||
|
||||
The purpose of the firmware is to bootstrap and measure an
|
||||
application.
|
||||
The purpose of the firmware is to load, measure, and start an
|
||||
application received from the client over the USB/UART.
|
||||
|
||||
The TKey has 128 kilobyte RAM. Firmware loads the application at the
|
||||
start of RAM. The current C runtime (`crt0.S`) of apps in our [apps
|
||||
repo](https://github.com/tillitis/tillitis-key1-apps) sets up the
|
||||
stack to start just below the end of RAM. This means that a larger app
|
||||
comes at the compromise of it having a smaller stack.
|
||||
|
||||
The firmware is part of the FPGA bitstream (ROM), and is loaded at
|
||||
`0x0000_0000`.
|
||||
|
||||
When the firmware starts it clears all RAM and then waits for commands
|
||||
coming in on `UART_RX`.
|
||||
|
||||
Typical use scenario:
|
||||
|
||||
1. The host sends the `FW_CMD_LOAD_APP` command with the size of the
|
||||
device app and the optional user-supplied secret as arguments and
|
||||
gets a `FW_RSP_LOAD_APP` back. After using this it's not
|
||||
possible to restart the loading of an application.
|
||||
2. If the host receives a successful response, it will send
|
||||
multiple `FW_CMD_LOAD_APP_DATA` commands, together containing the
|
||||
full application.
|
||||
3. On receiving `FW_CMD_LOAD_APP_DATA` commands the firmware places
|
||||
the data into `0x4000_0000` and upwards. The firmware replies
|
||||
with a `FW_RSP_LOAD_APP_DATA` response to the host for each
|
||||
received block except the last data block.
|
||||
4. When the final block of the application image is received with a
|
||||
`FW_CMD_LOAD_APP_DATA`, the firmware measures the application by
|
||||
computing a BLAKE2s digest over the entire application. Then
|
||||
firmware sends back the `FW_RSP_LOAD_APP_DATA_READY` response
|
||||
containing the measurement.
|
||||
5. The Compound Device Identifier (CDI) is then computed by using
|
||||
the `UDS`, application digest, and the `USS`, and placed in
|
||||
`CDI`. (see [Compound Device Identifier
|
||||
computation](#compound-device-identifier-computation)). Then the
|
||||
start address of the device app, `0x4000_0000`, is written to
|
||||
`APP_ADDR` and the size to `APP_SIZE` to let the device
|
||||
application know where it is loaded and how large it is, if it
|
||||
wants to relocate in RAM.
|
||||
6. The firmware now clears the special `FW_RAM` where it keeps its
|
||||
stack. After this it does no more function calls and uses no more
|
||||
automatic variables.
|
||||
7. Firmware starts the application by first switching to application
|
||||
mode by writing to the `SWITCH_APP` register. In this mode the
|
||||
MMIO region is restricted, e.g. some registers are removed
|
||||
(`UDS`), and some are switched from read/write to read-only (see
|
||||
[memory
|
||||
map](system_description.md#memory-mapped-hardware-functions)).
|
||||
|
||||
Then the firmware jumps to what's in `APP_ADDR` which starts
|
||||
the application.
|
||||
|
||||
There is now no other means of getting back from application mode
|
||||
to firmware mode than resetting/power cycling the device.
|
||||
|
||||
### Developing firmware
|
||||
|
||||
Standing in `hw/application_fpga/` you can run `make firmware.elf` to
|
||||
build just the firmware. You don't need all the FPGA development tools
|
||||
mentioned in [Toolchain setup](../toolchain_setup.md).
|
||||
|
||||
You need clang with 32-bit RISC-V support `-march=rv32iczmmul` which comes
|
||||
in clang 15. If you don't have version 15 you might get by with
|
||||
`-march=rv32imc` but things will break if you ever cause it to emit
|
||||
`div` instructions.
|
||||
|
||||
You also need `llvm-ar`, `llvm-objcopy`, `llvm-size` and `lld`.
|
||||
Typically these are all available in packages called "clang", "llvm",
|
||||
"lld" or similar.
|
||||
|
||||
If your available `objcopy` and `size` commands are anything other than
|
||||
the default `llvm-objcopy` and `llvm-size` define `OBJCOPY` and `SIZE`
|
||||
to whatever they're called on your system before calling `make
|
||||
firmware.elf`.
|
||||
|
||||
If you want to use our emulator, clone the `tk1` branch of [our
|
||||
version of qemu](https://github.com/tillitis/qemu) and build:
|
||||
|
||||
```
|
||||
$ git clone -b tk1 https://github.com/tillitis/qemu
|
||||
$ mkdir qemu/build
|
||||
$ cd qemu/build
|
||||
$ ../configure --target-list=riscv32-softmmu --disable-werror
|
||||
$ make -j $(nproc)
|
||||
```
|
||||
|
||||
(Built with warnings-as-errors disabled, see [this
|
||||
issue](https://github.com/tillitis/qemu/issues/3).)
|
||||
|
||||
Run it like this:
|
||||
|
||||
```
|
||||
$ /path/to/qemu/build/qemu-system-riscv32 -nographic -M tk1,fifo=chrid -bios firmware.elf \
|
||||
-chardev pty,id=chrid
|
||||
```
|
||||
|
||||
This attaches the FIFO to a tty, something like `/dev/pts/16` which
|
||||
you can use with host software to talk to the firmware.
|
||||
|
||||
To quit QEMU you can use: `Ctrl-a x` (see `Ctrl-a ?` for other commands).
|
||||
|
||||
Debugging? Use the HTIF console by removing `-DNOCONSOLE` from the
|
||||
`CFLAGS` and using the helper functions in `lib.c` like `htif_puts()`,
|
||||
`htif_putinthex()`, `htif_hexdump()`, and friends for printf-like
|
||||
debugging.
|
||||
|
||||
You can add `-d guest_errors` to the qemu command line to make QEMU
|
||||
send errors from the TK1 machine to stderr, typically things like
|
||||
memory writes outside of mapped regions.
|
||||
|
||||
You can also use the qemu monitor for debugging, e.g. `info
|
||||
registers`, or run qemu with `-d in_asm` or `-d trace:riscv_trap`.
|
||||
|
||||
### Reset
|
||||
|
||||
The PicoRV32 starts executing at `0x0000_0000`. We allow no `.data` or
|
||||
`.bss` sections. Our firmware starts at the `_start` symbol in
|
||||
`start.S` which first clears all RAM for any remaining data from
|
||||
previous applications, then initializes a stack starting at the top of
|
||||
`FW_RAM` at `0xd000_0800` and downwards and then calls `main`.
|
||||
|
||||
When the initialization is finished, the firmware waits for incoming
|
||||
commands from the host, by busy-polling the `UART_RX_{STATUS,DATA}`
|
||||
registers. When a complete command is read, the firmware executes the
|
||||
command.
|
||||
The firmware binary is part of the FPGA bitstream as the initial
|
||||
values of the Block RAMs used to construct the `FW_ROM`. The `FW_ROM`
|
||||
start address is located at `0x0000_0000` in the CPU memory map, which
|
||||
is the CPU reset vector.
|
||||
|
||||
### Firmware state machine
|
||||
|
||||
This is the state diagram of the firmware. There are only four states.
|
||||
A change of state occurs when we receive specific I/O or a fatal error
|
||||
occurs.
|
||||
|
||||
```mermaid
|
||||
stateDiagram-v2
|
||||
S1: initial
|
||||
S2: loading
|
||||
S3: running
|
||||
SE: failed
|
||||
|
||||
[*] --> S1
|
||||
|
||||
S1 --> S1: Commands
|
||||
S1 --> S2: LOAD_APP
|
||||
S1 --> SE: Error
|
||||
|
||||
S2 --> S2: LOAD_APP_DATA
|
||||
S2 --> S3: Last block received
|
||||
S2 --> SE: Error
|
||||
|
||||
S3 --> [*]
|
||||
```
|
||||
|
||||
States:
|
||||
|
||||
- `initial` - At start.
|
||||
- `loading` - Expect application data.
|
||||
- `run` - Computes CDI and starts the application.
|
||||
- `fail` - Stops waiting for commands, flashes LED forever.
|
||||
- `initial` - At start. Allows the commands `NAME_VERSION`, `GET_UDI`,
|
||||
`LOAD_APP`.
|
||||
- `loading` - Expect application data. Allows only the command
|
||||
`LOAD_APP_DATA`.
|
||||
- `run` - Computes CDI and starts the application. Allows no commands.
|
||||
- `fail` - Stops waiting for commands, flashes LED forever. Allows no
|
||||
commands.
|
||||
|
||||
Commands in state `initial`:
|
||||
|
||||
@ -212,17 +114,98 @@ Commands in state `loading`:
|
||||
|------------------------|----------------------------------|
|
||||
| `FW_CMD_LOAD_APP_DATA` | unchanged or `run` on last chunk |
|
||||
|
||||
Commands in state `run`: None.
|
||||
|
||||
Commands in state `fail`: None.
|
||||
|
||||
See [Firmware protocol in the Dev
|
||||
Handbook](http://dev.tillitis.se/protocol/#firmware-protocol) for the
|
||||
definition of the specific commands and their responses.
|
||||
|
||||
State changes from "initial" to "loading" when receiving `LOAD_APP`,
|
||||
which also sets the size of the app and thus the number of data blocks to expect. After
|
||||
that we expect several `LOAD_APP_DATA` commands until the last block
|
||||
is received, when state is changed to "running".
|
||||
|
||||
In "running", the loaded device app is measured, the Compound Device
|
||||
Identifier (CDI) is computed, we do some cleanup of firmware data
|
||||
structures, flip to application mode, and finally start the app, which
|
||||
ends the firmware state machine.
|
||||
|
||||
The device app is now running in application mode. There is no other
|
||||
means of getting back from application mode to firmware mode than
|
||||
resetting/power cycling the device. Note that ROM is still accessible
|
||||
in the memory map, so it's still possible to execute firmware code in
|
||||
application mode, but with no privileged access.
|
||||
|
||||
Firmware loads the application at the start of RAM (`0x4000_0000`). It
|
||||
uses the special FW\_RAM for its own stack.
|
||||
|
||||
When the firmware starts it clears all FW\_RAM, then sets up a stack
|
||||
there before jumping to `main()`.
|
||||
|
||||
When reset is released, the CPU starts executing the firmware. It
|
||||
begins by clearing all CPU registers, and then sets up a stack for
|
||||
itself and then jumps to main().
|
||||
|
||||
Beginning at `main()` it sets up the "system calls", then fills the
|
||||
entire RAM with pseudo-random data and sets up the RAM address and
|
||||
data hardware scrambling with values from the True Random Number
|
||||
Generator (TRNG). It then waits for data coming in through the UART.
|
||||
|
||||
Typical expected use scenario:
|
||||
|
||||
1. The client sends the `FW_CMD_LOAD_APP` command with the size of
|
||||
the device app and the optional 32 byte hash of the user-supplied
|
||||
secret as arguments and gets a `FW_RSP_LOAD_APP` back. After
|
||||
using this it's not possible to restart the loading of an
|
||||
application.
|
||||
|
||||
2. If the client receives a successful response, it will send
|
||||
multiple `FW_CMD_LOAD_APP_DATA` commands, together containing the
|
||||
full application.
|
||||
|
||||
3. On receiving `FW_CMD_LOAD_APP_DATA` commands the firmware places
|
||||
the data into `0x4000_0000` and upwards. The firmware replies
|
||||
with a `FW_RSP_LOAD_APP_DATA` response to the client for each
|
||||
received block except the last data block.
|
||||
|
||||
4. When the final block of the application image is received with a
|
||||
`FW_CMD_LOAD_APP_DATA`, the firmware measures the application by
|
||||
computing a BLAKE2s digest over the entire application. Then
|
||||
firmware sends back the `FW_RSP_LOAD_APP_DATA_READY` response
|
||||
containing the digest.
|
||||
|
||||
5. The Compound Device Identifier
|
||||
([CDI](#compound-device-identifier-computation)) is then
|
||||
computed by doing a new BLAKE2s using the Unique Device Secret
|
||||
(UDS), the application digest, and any User Supplied Secret
|
||||
(USS).
|
||||
|
||||
6. The start address of the device app, currently `0x4000_0000`, is
|
||||
written to `APP_ADDR` and the size of the binary to `APP_SIZE` to
|
||||
let the device application know where it is loaded and how large
|
||||
it is, if it wants to relocate in RAM.
|
||||
|
||||
7. The firmware now clears the special `FW_RAM` where it keeps its
|
||||
stack. After this it performs no more function calls and uses no
|
||||
more automatic variables.
|
||||
|
||||
8. Firmware starts the application by first switching from
|
||||
firmware mode to application mode by writing to the `SWITCH_APP`
|
||||
register. In this mode the MMIO region is restricted, e.g. some
|
||||
registers are removed (`UDS`), and some are switched from
|
||||
read/write to read-only (see [the memory
|
||||
map](https://dev.tillitis.se/memory/)).
|
||||
|
||||
Then the firmware jumps to what is in `APP_ADDR` which starts the
|
||||
application.
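
Seen from the client, steps 2 to 4 above amount to simple bookkeeping: how many `FW_CMD_LOAD_APP_DATA` commands to send and which responses to expect. The sketch below is only an illustration under stated assumptions. The payload size per command (`CHUNK_PAYLOAD`) is a placeholder, the authoritative value is defined by the firmware protocol, and the struct and function names are hypothetical.

```c
#include <stddef.h>

/* Assumption: app bytes carried by one FW_CMD_LOAD_APP_DATA command.
 * Placeholder value; check the firmware protocol definition. */
#define CHUNK_PAYLOAD 127u

/* Hypothetical summary of one app upload. */
struct upload_plan {
    size_t data_cmds;    /* FW_CMD_LOAD_APP_DATA commands to send */
    size_t data_rsps;    /* FW_RSP_LOAD_APP_DATA replies expected */
    size_t ready_rsps;   /* FW_RSP_LOAD_APP_DATA_READY replies expected */
    size_t last_payload; /* app bytes carried by the final command */
};

static struct upload_plan plan_upload(size_t app_size)
{
    struct upload_plan p = {0, 0, 0, 0};

    if (app_size == 0) {
        return p; /* nothing to send */
    }

    /* Round up: a partially filled last block still needs a command. */
    p.data_cmds = (app_size + CHUNK_PAYLOAD - 1) / CHUNK_PAYLOAD;

    /* Every block except the last is answered with FW_RSP_LOAD_APP_DATA;
     * the last one is answered with FW_RSP_LOAD_APP_DATA_READY. */
    p.data_rsps = p.data_cmds - 1;
    p.ready_rsps = 1;
    p.last_payload = app_size - (p.data_cmds - 1) * CHUNK_PAYLOAD;

    return p;
}
```

With the assumed payload size, a 1000 byte app needs 8 data commands, with 7 ordinary replies and one final reply carrying the digest.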
|
||||
|
||||
If during this whole time any commands are received which are not
|
||||
allowed in the current state, we enter the "failed" state and execute
|
||||
an illegal instruction. An illegal instruction traps the CPU and
|
||||
hardware blinks the status LED red until a power cycle. No further
|
||||
instructions are executed.
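
The state handling described above can be sketched as a small transition function. This is not the firmware source, just a minimal model under assumptions: all type and function names are hypothetical, a `LOAD_APP` with size zero is treated as one of the errors in the diagram, and `run` and `fail` are modelled as terminal states because the firmware no longer reads commands there.

```c
#include <stdint.h>

enum fw_state { FW_STATE_INITIAL, FW_STATE_LOADING, FW_STATE_RUN, FW_STATE_FAIL };
enum fw_cmd { CMD_NAME_VERSION, CMD_GET_UDI, CMD_LOAD_APP, CMD_LOAD_APP_DATA };

struct fw_ctx {
    enum fw_state state;
    uint32_t left; /* app bytes still expected while loading */
};

/* Advance the state machine on one received command.
 * nbytes: for LOAD_APP the announced app size, for LOAD_APP_DATA the
 * number of app bytes in this block; ignored for other commands. */
static enum fw_state fw_step(struct fw_ctx *ctx, enum fw_cmd cmd, uint32_t nbytes)
{
    switch (ctx->state) {
    case FW_STATE_INITIAL:
        if (cmd == CMD_NAME_VERSION || cmd == CMD_GET_UDI) {
            break; /* allowed, state unchanged */
        }
        if (cmd == CMD_LOAD_APP && nbytes > 0) {
            ctx->left = nbytes;
            ctx->state = FW_STATE_LOADING;
            break;
        }
        ctx->state = FW_STATE_FAIL; /* command not allowed here */
        break;
    case FW_STATE_LOADING:
        if (cmd != CMD_LOAD_APP_DATA) {
            ctx->state = FW_STATE_FAIL;
            break;
        }
        if (nbytes >= ctx->left) {
            ctx->left = 0;
            ctx->state = FW_STATE_RUN; /* last block received */
        } else {
            ctx->left -= nbytes;
        }
        break;
    case FW_STATE_RUN:
    case FW_STATE_FAIL:
        break; /* terminal in this sketch: no commands are processed */
    }
    return ctx->state;
}
```

Starting from `initial`, the sequence `NAME_VERSION`, `LOAD_APP` with size 200, and two `LOAD_APP_DATA` blocks of 127 bytes ends in `run`, while a `LOAD_APP_DATA` sent before `LOAD_APP` ends in `fail`.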
|
||||
|
||||
### User-supplied Secret (USS)
|
||||
|
||||
USS is a 32-byte secret provided by the user. Typically a host
|
||||
USS is a 32-byte secret provided by the user. Typically a client
|
||||
program gets a secret from the user and then does a key derivation
|
||||
function of some sort, for instance a BLAKE2s, to get 32 bytes which
|
||||
it sends to the firmware to be part of the CDI computation.
|
||||
@ -256,6 +239,10 @@ Then we continue with the CDI computation by updating with an optional
|
||||
USS and then finalizing the hash, storing the resulting digest in
|
||||
`CDI`.
|
||||
|
||||
We also clear the entire `FW_RAM` where the stack lived, including the
|
||||
BLAKE2s context with the UDS, very soon after that, just before
|
||||
jumping to the application.
|
||||
|
||||
### Firmware services
|
||||
|
||||
The firmware exposes a BLAKE2s function through a function pointer
|
||||
@ -283,11 +270,38 @@ typedef struct {
|
||||
```
|
||||
|
||||
The `libcommon` library in
|
||||
[tillitis-key1-apps](https://github.com/tillitis/tillitis-key1-apps/)
|
||||
has a wrapper for using this function called `blake2s()`.
|
||||
[tkey-libs](https://github.com/tillitis/tkey-libs/)
|
||||
has a wrapper for using this function called `blake2s()` which needs
|
||||
to be maintained if you make any changes to the firmware call.
|
||||
|
||||
## Applications
|
||||
## Developing firmware
|
||||
|
||||
See [our apps repo](https://github.com/tillitis/tillitis-key1-apps)
|
||||
for examples of client and TKey apps as well as libraries for writing
|
||||
both.
|
||||
Standing in `hw/application_fpga/` you can run `make firmware.elf` to
|
||||
build just the firmware. You don't need all the FPGA development
|
||||
tools. See [the Developer Handbook](https://dev.tillitis.se/tools/)
|
||||
for the tools you need. The easiest is probably to use our OCI image,
|
||||
`ghcr.io/tillitis/tkey-builder`.
|
||||
|
||||
[Our version of qemu](https://dev.tillitis.se/tools/#qemu-emulator) is
|
||||
also useful for debugging the firmware. You can attach GDB, use
|
||||
breakpoints, et cetera.
|
||||
|
||||
If you want to use plain debug prints you can remove `-DNOCONSOLE`
|
||||
from the `CFLAGS` in the Makefile and use the helper functions in
|
||||
`lib.c` like `htif_puts()`, `htif_putinthex()`, `htif_hexdump()`, and
|
||||
friends for printf-like debugging. Note that these functions are only
|
||||
usable in qemu.
|
||||
|
||||
### Test firmware
|
||||
|
||||
The test firmware is in `testfw`. It's currently a bit of a hack and
|
||||
just runs through expected behavior of the hardware cores, giving
|
||||
special focus to access control in firmware mode and application mode.
|
||||
|
||||
It outputs results on the UART. This means that you have to attach a
|
||||
terminal program to the serial port device, even if it's running in
|
||||
qemu. It waits for you to type a character before starting the tests.
|
||||
|
||||
It needs to be compiled with `-Os` instead of `-O2` in `CFLAGS` in the
|
||||
ordinary `application_fpga/Makefile` to be able to fit in the 6 kByte
|
||||
ROM.
|
||||