From f1534e5dada5acd3b2c26a237a37b9a6ec27b628 Mon Sep 17 00:00:00 2001
From: Michael Cardell Widerkrantz
Date: Thu, 27 Jun 2024 22:23:49 +0200
Subject: [PATCH] doc: Update and expand firmware README
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Remove all text about other software than firmware.
- Remove the Reset section.
- Include a diagram and detailed explanation about the state machine
  in close vicinity.
- Describe the test firmware.

Co-authored-by: Joachim Strömbergson
---
 hw/application_fpga/fw/README.md | 376 ++++++++++++++++---------------
 1 file changed, 195 insertions(+), 181 deletions(-)

diff --git a/hw/application_fpga/fw/README.md b/hw/application_fpga/fw/README.md
index cb0629b..989b46b 100644
--- a/hw/application_fpga/fw/README.md
+++ b/hw/application_fpga/fw/README.md
@@ -1,201 +1,103 @@
-# Tillitis TKey software
-
-**NOTE:** Documentation migrated to dev.tillitis.se, this is kept for
-history. This is likely to be outdated.
+# Firmware

 ## Introduction

-This text is both an introduction to and a requirement specification
-of the TKey firmware, its protocol, and an overview of how TKey
-applications are supposed to work. For an overview of the TKey
-concepts, see [System Description](system_description.md).
+This text is an introduction to, a requirement specification of,
+and some implementation notes on the TKey firmware. It also gives a
+few hints on developing and debugging the firmware.

-First, some definitions:
+This text is specific to the firmware. For a more general description
+of how to implement device apps, see [the TKey Developer
+Handbook](https://dev.tillitis.se/).
+
+## Definitions

 - Firmware - software in ROM responsible for loading applications. The
-  firmware is included as part of the FPGA bit stream.
-- Application or app - software supplied by the host machine which is
+  firmware is included as part of the FPGA bitstream and not
+  replaceable on a usual consumer TKey.
+- Device application or app - software supplied by the client which is
   received, loaded, measured, and started by the firmware.

+## CPU modes and firmware
+
 The TKey has two modes of software operation: firmware mode and
-application mode. The firmware mode has the responsibility of
-receiving, measuring, loading, and starting the application. When the
-firmware is about to start the application it switches to a more
-constrained environment, the application mode.
+application mode. The TKey always starts in firmware mode and starts
+the firmware. When the firmware is about to start the application it
+switches to a more constrained environment, the application mode.

-The firmware and application uses a memory mapped input/output (MMIO)
-for communication with the hardware. The memory map is constrained
-when running in application mode, e.g. FW-RAM and UDS isn't readable,
-and several MMIO addresses are either not readable or not writable for
-the application.
+The TKey hardware cores are memory mapped. Firmware has complete
+access, except that the UDS is readable only once. The memory map is
+constrained when running in application mode, e.g. FW\_RAM and UDS
+aren't readable, and several other hardware addresses are either not
+readable or not writable for the application.

-See table in the [System
-Description](system_description.md#memory-mapped-hardware-functions)
-for details about access rules control in the memory system and MMIO.
+See the table in [the Developer
+Handbook](https://dev.tillitis.se/memory/) for an overview of the
+memory access control.

-The firmware (and optionally all software) on the TKey can communicate
-to the host via the `UART_{RX,TX}_{STATUS,DATA}` registers, using the
-framing protocol described in the [Framing
+## Communication
+
+The firmware communicates with the host via the
+`UART_{RX,TX}_{STATUS,DATA}` registers, using the framing protocol
+described in the [Framing
 Protocol](https://dev.tillitis.se/protocol/).

-The firmware defines a protocol on top of this framing layer which is
+The firmware uses a protocol on top of this framing layer which is
 used to bootstrap the application. All commands are initiated by the
-host. All commands receive a reply. See [Firmware
+client. All commands receive a reply. See [Firmware
 protocol](#firmware-protocol) for specific details.

-Applications define their own protocol used for communication with
-their host part. They may or may not be based on the Framing Protocol.
-
-## CPU
-
-The CPU is a PicoRV32, a 32-bit RISC-V processor (arch: RV32IC\_Zmmul)
-which runs the firmware and the application. The firmware and
-application both run in RISC-V machine mode. All types are
-little-endian.
-
-## Constraints
+## Memory constraints

 - ROM: 6 kByte.
+- FW\_RAM: 2 kByte.
 - RAM: 128 kByte.

-## Firmware
+## Firmware behaviour

-The purpose of the firmware is to bootstrap and measure an
-application.
+The purpose of the firmware is to load, measure, and start an
+application received from the client over the USB/UART.

-The TKey has 128 kilobyte RAM. Firmware loads the application at the
-start of RAM. The current C runtime (`crt0.S`) of apps in our [apps
-repo](https://github.com/tillitis/tillitis-key1-apps) sets up the
-stack to start just below the end of RAM. This means that a larger app
-comes at the compromise of it having a smaller stack.
-
-The firmware is part of FPGA bitstream (ROM), and is loaded at
-`0x0000_0000`.
-
-When the firmware starts it clears all RAM and then wait for commands
-coming in on `UART_RX`.
-
-Typical use scenario:
-
- 1. The host sends the `FW_CMD_LOAD_APP` command with the size of the
-    device app and the optional user-supplied secret as arguments and
-    and gets a `FW_RSP_LOAD_APP` back. After using this it's not
-    possible to restart the loading of an application.
- 2. If the the host receive a sucessful response, it will send
-    multiple `FW_CMD_LOAD_APP_DATA` commands, together containing the
-    full application.
- 3. On receiving`FW_CMD_LOAD_APP_DATA` commands the firmware places
-    the data into `0x4000_0000` and upwards. The firmware replies
-    with a `FW_RSP_LOAD_APP_DATA` response to the host for each
-    received block except the last data block.
- 4. When the final block of the application image is received with a
-    `FW_CMD_LOAD_APP_DATA`, the firmware measure the application by
-    computing a BLAKE2s digest over the entire application. Then
-    firmware send back the `FW_RSP_LOAD_APP_DATA_READY` response
-    containing the measurement.
- 5. The Compound Device Identifier (CDI) is then computed by using
-    the `UDS`, application digest, and the `USS`, and placed in
-    `CDI`. (see [Compound Device Identifier
-    computation](#compound-device-identifier-computation)) Then the
-    start address of the device app, `0x4000_0000`, is written to
-    `APP_ADDR` and the size to `APP_SIZE` to let the device
-    application know where it is loaded and how large it is, if it
-    wants to relocate in RAM.
- 6. The firmware now clears the special `FW_RAM` where it keeps it
-    stack. After this it does no more function calls and uses no more
-    automatic variables.
- 7. Firmware starts the application by first switching to application
-    mode by writing to the `SWITCH_APP` register. In this mode the
-    MMIO region is restricted, e.g. some registers are removed
-    (`UDS`), and some are switched from read/write to read-only (see
-    [memory
-    map](system_description.md#memory-mapped-hardware-functions)).
-
-    Then the firmware jumps to what's in `APP_ADDR` which starts
-    the application.
-
-    There is now no other means of getting back from application mode
-    to firmware mode than resetting/power cycling the device.
-
-### Developing firmware
-
-Standing in `hw/application_fpga/` you can run `make firmware.elf` to
-build just the firmware. You don't need all the FPGA development tools
-mentioned in [Toolchain setup](../toolchain_setup.md).
-
-You need clang with 32 RISC-V support `-march=rv32iczmmul` which comes
-in clang 15. If you don't have version 15 you might get by with
-`-march=rv32imc` but things will break if you ever cause it to emit
-`div` instructions.
-
-You also need `llvm-ar`, `llvm-objcopy`, `llvm-size` and `lld`.
-Typically these are all available in packages called "clang", "llvm",
-"lld" or similar.
-
-If your available `objcopy` and `size` commands is anything other than
-the default `llvm-objcopy` and `llvm-size` define `OBJCOPY` and `SIZE`
-to whatever they're called on your system before calling `make
-firmware.elf`.
-
-If you want to use our emulator, clone the `tk1` branch of [our
-version of qemu](https://github.com/tillitis/qemu) and build:
-
-```
-$ git clone -b tk1 https://github.com/tillitis/qemu
-$ mkdir qemu/build
-$ cd qemu/build
-$ ../configure --target-list=riscv32-softmmu --disable-werror
-$ make -j $(nproc)
-```
-
-(Built with warnings-as-errors disabled, see [this
-issue](https://github.com/tillitis/qemu/issues/3).)
-
-Run it like this:
-
-```
-$ /path/to/qemu/build/qemu-system-riscv32 -nographic -M tk1,fifo=chrid -bios firmware.elf \
-  -chardev pty,id=chrid
-```
-
-This attaches the FIFO to a tty, something like `/dev/pts/16` which
-you can use with host software to talk to the firmware.
-
-To quit QEMU you can use: `Ctrl-a x` (see `Ctrl-a ?` for other commands).
-
-Debugging? Use the HTIF console by removing `-DNOCONSOLE` from the
-`CFLAGS` and using the helper functions in `lib.c` like `htif_puts()`
-`htif_putinthex()` `htif_hexdump()` and friends for printf-like
-debugging.
-
-You can add `-d guest_errors` to the qemu commandline to make QEMU
-send errors from the TK1 machine to stderr, typically things like
-memory writes outside of mapped regions.
-
-You can also use the qemu monitor for debugging, e.g. `info
-registers`, or run qemu with `-d in_asm` or `-d trace:riscv_trap`.
-
-### Reset
-
-The PicoRV32 starts executing at `0x0000_0000`. We allow no `.data` or
-`.bss` sections. Our firmware starts at the `_start` symbol in
-`start.S` which first clears all RAM for any remaining data from
-previous applications, then initializes a stack starting at the top of
-`FW_RAM` at `0xd000_0800` and downwards and then calls `main`.
-
-When the initialization is finished, the firmware waits for incoming
-commands from the host, by busy-polling the `UART_RX_{STATUS,DATA}`
-registers. When a complete command is read, the firmware executes the
-command.
+The firmware binary is part of the FPGA bitstream as the initial
+values of the Block RAMs used to construct the `FW_ROM`. The `FW_ROM`
+start address is located at `0x0000_0000` in the CPU memory map, which
+is the CPU reset vector.

 ### Firmware state machine

+This is the state diagram of the firmware. There are only four states.
+A change of state occurs when we receive specific I/O or a fatal error
+occurs.
+
+```mermaid
+stateDiagram-v2
+    S1: initial
+    S2: loading
+    S3: running
+    SE: failed
+
+    [*] --> S1
+
+    S1 --> S1: Commands
+    S1 --> S2: LOAD_APP
+    S1 --> SE: Error
+
+    S2 --> S2: LOAD_APP_DATA
+    S2 --> S3: Last block received
+    S2 --> SE: Error
+
+    S3 --> [*]
+```
+
 States:

-- `initial` - At start.
-- `loading` - Expect application data.
-- `run` - Computes CDI and starts the application.
-- `fail` - Stops waiting for commands, flashes LED forever.
+- `initial` - At start. Allows the commands `NAME_VERSION`, `GET_UDI`,
+  `LOAD_APP`.
+- `loading` - Expect application data. Allows only the command
+  `LOAD_APP_DATA`.
+- `run` - Computes CDI and starts the application. Allows no commands.
+- `fail` - Stops waiting for commands, flashes LED forever. Allows no
+  commands.

 Commands in state `initial`:

@@ -212,17 +114,98 @@ Commands in state `loading`:
 |------------------------|----------------------------------|
 | `FW_CMD_LOAD_APP_DATA` | unchanged or `run` on last chunk |

-Commands in state `run`: None.
-
-Commands in state `fail`: None.
-
 See [Firmware protocol in the Dev
 Handbook](http://dev.tillitis.se/protocol/#firmware-protocol) for the
 definition of the specific commands and their responses.

+State changes from "initial" to "loading" when receiving `LOAD_APP`,
+which also sets the application size and thereby the number of data
+blocks to expect. After that we expect several `LOAD_APP_DATA` commands
+until the last block is received, when state is changed to "running".
+
+In "running", the loaded device app is measured, the Compound Device
+Identifier (CDI) is computed, we do some cleanup of firmware data
+structures, flip to application mode, and finally start the app, which
+ends the firmware state machine.
+
+The device app is now running in application mode. There is no other
+means of getting back from application mode to firmware mode than
+resetting/power cycling the device. Note that ROM is still accessible
+in the memory map, so it's still possible to execute firmware code in
+application mode, but with no privileged access.
+
+Firmware loads the application at the start of RAM (`0x4000_0000`). It
+uses the special FW\_RAM for its own stack.
+
+When the firmware starts it clears all FW\_RAM, then sets up a stack
+there before jumping to `main()`.
+
+When reset is released, the CPU starts executing the firmware. It
+begins by clearing all CPU registers, and then sets up a stack for
+itself and then jumps to `main()`.
+
+Beginning at `main()` it sets up the "system calls", then fills the
+entire RAM with pseudo-random data and sets up the RAM address and
+data hardware scrambling with values from the True Random Number
+Generator (TRNG). It then waits for data coming in through the UART.
+
+Typical expected use scenario:
+
+ 1. The client sends the `FW_CMD_LOAD_APP` command with the size of
+    the device app and the optional 32-byte hash of the user-supplied
+    secret as arguments and gets a `FW_RSP_LOAD_APP` back. After
+    using this it's not possible to restart the loading of an
+    application.
+
+ 2. If the client receives a successful response, it will send
+    multiple `FW_CMD_LOAD_APP_DATA` commands, together containing the
+    full application.
+
+ 3. On receiving `FW_CMD_LOAD_APP_DATA` commands the firmware places
+    the data into `0x4000_0000` and upwards. The firmware replies
+    with a `FW_RSP_LOAD_APP_DATA` response to the client for each
+    received block except the last data block.
+
+ 4. When the final block of the application image is received with a
+    `FW_CMD_LOAD_APP_DATA`, the firmware measures the application by
+    computing a BLAKE2s digest over the entire application. Then the
+    firmware sends back the `FW_RSP_LOAD_APP_DATA_READY` response
+    containing the digest.
+
+ 5. The Compound Device Identifier
+    ([CDI](#compound-device-identifier-computation)) is then
+    computed by doing a new BLAKE2s using the Unique Device Secret
+    (UDS), the application digest, and any User Supplied Secret
+    (USS).
+
+ 6. The start address of the device app, currently `0x4000_0000`, is
+    written to `APP_ADDR` and the size of the binary to `APP_SIZE` to
+    let the device application know where it is loaded and how large
+    it is, if it wants to relocate in RAM.
+
+ 7. The firmware now clears the special `FW_RAM` where it keeps its
+    stack. After this it performs no more function calls and uses no
+    more automatic variables.
+
+ 8. Firmware starts the application by first switching from
+    firmware mode to application mode by writing to the `SWITCH_APP`
+    register. In this mode the MMIO region is restricted, e.g. some
+    registers are removed (`UDS`), and some are switched from
+    read/write to read-only (see [the memory
+    map](https://dev.tillitis.se/memory/)).
+
+    Then the firmware jumps to what is in `APP_ADDR` which starts the
+    application.
+
+If during this whole time any commands are received which are not
+allowed in the current state, we enter the "failed" state and execute
+an illegal instruction.
+An illegal instruction traps the CPU and hardware blinks the status
+LED red until a power cycle. No further instructions are executed.
+
 ### User-supplied Secret (USS)

-USS is a 32 bytes long secret provided by the user. Typically a host
+USS is a 32-byte secret provided by the user. Typically a client
 program gets a secret from the user and then does a key derivation
 function of some sort, for instance a BLAKE2s, to get 32 bytes which
 it sends to the firmware to be part of the CDI computation.
@@ -256,6 +239,10 @@ Then we continue with the CDI computation by updating with an optional
 USS and then finalizing the hash, storing the resulting digest in
 `CDI`.

+We also clear the entire `FW_RAM` where the stack lived, including the
+BLAKE2s context with the UDS, very soon after that, just before
+jumping to the application.
+
 ### Firmware services

 The firmware exposes a BLAKE2s function through a function pointer
@@ -283,11 +270,38 @@ typedef struct {
 ```

 The `libcommon` library in
-[tillitis-key1-apps](https://github.com/tillitis/tillitis-key1-apps/)
-has a wrapper for using this function called `blake2s()`.
+[tkey-libs](https://github.com/tillitis/tkey-libs/)
+has a wrapper for using this function called `blake2s()` which needs
+to be maintained if you make any changes to the firmware call.

-## Applications
+## Developing firmware

-See [our apps repo](https://github.com/tillitis/tillitis-key1-apps)
-for examples of client and TKey apps as well as libraries for writing
-both.
+Standing in `hw/application_fpga/` you can run `make firmware.elf` to
+build just the firmware. You don't need all the FPGA development
+tools. See [the Developer Handbook](https://dev.tillitis.se/tools/)
+for the tools you need. The easiest is probably to use our OCI image,
+`ghcr.io/tillitis/tkey-builder`.
+
+[Our version of qemu](https://dev.tillitis.se/tools/#qemu-emulator) is
+also useful for debugging the firmware. You can attach GDB, use
+breakpoints, et cetera.
+
+If you want to use plain debug prints you can remove `-DNOCONSOLE`
+from the `CFLAGS` in the Makefile and use the helper functions in
+`lib.c` like `htif_puts()`, `htif_putinthex()`, `htif_hexdump()` and
+friends for printf-like debugging. Note that these functions are only
+usable in qemu.
+
+### Test firmware
+
+The test firmware is in `testfw`. It's currently a bit of a hack and
+just runs through expected behavior of the hardware cores, giving
+special focus to access control in firmware mode and application mode.
+
+It outputs results on the UART. This means that you have to attach a
+terminal program to the serial port device, even if it's running in
+qemu. It waits for you to type a character before starting the tests.
+
+It needs to be compiled with `-Os` instead of `-O2` in `CFLAGS` in the
+ordinary `application_fpga/Makefile` to be able to fit in the 6 kByte
+ROM.
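
Review note, not part of the patch: the state machine described in the README text can be sketched as a pure transition function in C. This is only an illustration under stated assumptions, not the firmware's actual code. The identifiers `fw_state`, `fw_cmd`, `fw_next_state()` and the `last_block` flag are hypothetical; only the state names and the commands `NAME_VERSION`, `GET_UDI`, `LOAD_APP` and `LOAD_APP_DATA` come from the README. Treating any command seen in `run` as an error is also an assumption, since the README only says that `run` allows no commands.

```c
#include <stdbool.h>

/* States as named in the README's "States" list. */
enum fw_state { ST_INITIAL, ST_LOADING, ST_RUN, ST_FAIL };

/* Commands named in the README; the enum itself is hypothetical. */
enum fw_cmd {
	CMD_NAME_VERSION,
	CMD_GET_UDI,
	CMD_LOAD_APP,
	CMD_LOAD_APP_DATA
};

/*
 * Return the next state for a command received in the given state.
 * last_block is only meaningful for CMD_LOAD_APP_DATA and tells
 * whether this was the final block of the application image.
 */
enum fw_state fw_next_state(enum fw_state state, enum fw_cmd cmd,
			    bool last_block)
{
	switch (state) {
	case ST_INITIAL:
		switch (cmd) {
		case CMD_NAME_VERSION:
		case CMD_GET_UDI:
			return ST_INITIAL; /* allowed, state unchanged */
		case CMD_LOAD_APP:
			return ST_LOADING;
		default:
			return ST_FAIL; /* not allowed in "initial" */
		}
	case ST_LOADING:
		if (cmd != CMD_LOAD_APP_DATA)
			return ST_FAIL; /* only LOAD_APP_DATA is allowed */
		return last_block ? ST_RUN : ST_LOADING;
	case ST_RUN:  /* assumption: a command here counts as an error */
	case ST_FAIL: /* terminal until power cycle */
	default:
		return ST_FAIL;
	}
}
```

A caller would feed each parsed command through `fw_next_state()` and stop handling input once it returns `ST_RUN` or `ST_FAIL`. Keeping the transition logic free of I/O makes it easy to check against the two command tables in the README.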