doc: Move software.md to fw/README

2025-08-05 21:24:11 -04:00 · 2024-06-27 22:22:14 +02:00 · cc16c8481c
commit cc16c8481c
parent 058c8e970c
1 changed files with 0 additions and 0 deletions
--- a/hw/application_fpga/fw/README.md
+++ b/hw/application_fpga/fw/README.md
@@ -0,0 +1,293 @@
+# Tillitis TKey software
+
+**NOTE:** This documentation has migrated to dev.tillitis.se. This
+copy is kept for history and is likely to be outdated.
+
+## Introduction
+
+This text is both an introduction to and a requirement specification
+of the TKey firmware, its protocol, and an overview of how TKey
+applications are supposed to work. For an overview of the TKey
+concepts, see [System Description](system_description.md).
+
+First, some definitions:
+
+- Firmware - software in ROM responsible for loading applications. The
+  firmware is included as part of the FPGA bit stream.
+- Application or app - software supplied by the host machine which is
+  received, loaded, measured, and started by the firmware.
+
+The TKey has two modes of software operation: firmware mode and
+application mode. The firmware mode has the responsibility of
+receiving, measuring, loading, and starting the application. When the
+firmware is about to start the application it switches to a more
+constrained environment, the application mode.
+
+The firmware and the application use memory-mapped input/output (MMIO)
+to communicate with the hardware. The memory map is constrained
+when running in application mode, e.g. `FW_RAM` and `UDS` aren't
+readable, and several MMIO addresses are either not readable or not
+writable for the application.
+
+See the table in the [System
+Description](system_description.md#memory-mapped-hardware-functions)
+for details about the access control rules in the memory system and MMIO.
+
+The firmware (and optionally all software) on the TKey can communicate
+with the host via the `UART_{RX,TX}_{STATUS,DATA}` registers, using the
+framing protocol described in the [Framing
+Protocol](https://dev.tillitis.se/protocol/).
+
+The firmware defines a protocol on top of this framing layer which is
+used to bootstrap the application. All commands are initiated by the
+host. All commands receive a reply. See [Firmware
+protocol](#firmware-protocol) for specific details.
+
+Applications define their own protocol used for communication with
+their host part. They may or may not be based on the Framing Protocol.
+
+## CPU
+
+The CPU is a PicoRV32, a 32-bit RISC-V processor (arch: RV32IC\_Zmmul)
+which runs the firmware and the application. The firmware and
+application both run in RISC-V machine mode. All types are
+little-endian.
+
+## Constraints
+
+- ROM: 6 kByte.
+- RAM: 128 kByte.
+
+## Firmware
+
+The purpose of the firmware is to bootstrap and measure an
+application.
+
+The TKey has 128 kByte of RAM. The firmware loads the application at
+the start of RAM. The current C runtime (`crt0.S`) of apps in our [apps
+repo](https://github.com/tillitis/tillitis-key1-apps) sets up the
+stack to start just below the end of RAM. This means that a larger app
+comes at the cost of a smaller stack.
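As a rough illustration of that trade-off, the sketch below computes how much RAM is left above an app image of a given size. It assumes RAM starts at `0x4000_0000` (the load address used elsewhere in this document) and spans 128 kByte. The helper names `app_end_addr` and `stack_headroom` are hypothetical, and the result ignores anything else the app keeps in RAM outside its loaded image.

```c
#include <stdint.h>

#define RAM_BASE 0x40000000u    /* app load address per this document */
#define RAM_SIZE (128u * 1024u) /* 128 kByte */

/* First address after an app image of app_size bytes (hypothetical helper). */
static inline uint32_t app_end_addr(uint32_t app_size)
{
	return RAM_BASE + app_size;
}

/* Bytes between the end of the app image and the end of RAM.
 * Returns 0 if the image does not fit in RAM at all. */
static inline uint32_t stack_headroom(uint32_t app_size)
{
	if (app_size >= RAM_SIZE)
		return 0;
	return RAM_SIZE - app_size;
}
```

For example, a 64 kByte app would leave at most 64 kByte for the stack under these assumptions.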
+
+The firmware is part of the FPGA bitstream (ROM), and is loaded at
+`0x0000_0000`.
+
+When the firmware starts it clears all RAM and then waits for commands
+coming in on `UART_RX`.
+
+Typical use scenario:
+
+  1. The host sends the `FW_CMD_LOAD_APP` command with the size of the
+     device app and the optional user-supplied secret as arguments
+     and gets a `FW_RSP_LOAD_APP` back. After using this it's not
+     possible to restart the loading of an application.
+  2. If the host receives a successful response, it sends
+     multiple `FW_CMD_LOAD_APP_DATA` commands, together containing the
+     full application.
+  3. On receiving `FW_CMD_LOAD_APP_DATA` commands the firmware places
+     the data into `0x4000_0000` and upwards. The firmware replies
+     with a `FW_RSP_LOAD_APP_DATA` response to the host for each
+     received block except the last data block.
+  4. When the final block of the application image is received with a
+     `FW_CMD_LOAD_APP_DATA`, the firmware measures the application by
+     computing a BLAKE2s digest over the entire application. Then the
+     firmware sends back the `FW_RSP_LOAD_APP_DATA_READY` response
+     containing the measurement.
+  5. The Compound Device Identifier (CDI) is then computed by using
+     the `UDS`, application digest, and the `USS`, and placed in
+     `CDI` (see [Compound Device Identifier
+     computation](#compound-device-identifier-computation)). Then the
+     start address of the device app, `0x4000_0000`, is written to
+     `APP_ADDR` and the size to `APP_SIZE` to let the device
+     application know where it is loaded and how large it is, if it
+     wants to relocate in RAM.
+  6. The firmware now clears the special `FW_RAM` where it keeps its
+     stack. After this it makes no more function calls and uses no more
+     automatic variables.
+  7. Firmware starts the application by first switching to application
+     mode by writing to the `SWITCH_APP` register. In this mode the
+     MMIO region is restricted, e.g. some registers are removed
+     (`UDS`), and some are switched from read/write to read-only (see
+     [memory
+     map](system_description.md#memory-mapped-hardware-functions)).
+
+     Then the firmware jumps to what's in `APP_ADDR` which starts
+     the application.
+
+     There is now no other means of getting back from application mode
+     to firmware mode than resetting/power cycling the device.
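From the host's point of view, steps 2 to 4 mean splitting the app image into a series of `FW_CMD_LOAD_APP_DATA` commands. The sketch below counts how many commands are needed. The payload size per command is not stated in this text, so it is a parameter here, and the function name `load_app_data_commands` is hypothetical.

```c
#include <stddef.h>

/* Number of FW_CMD_LOAD_APP_DATA commands needed to send app_size bytes
 * when each command carries at most payload_len bytes.
 * Returns 0 for an empty app or a zero payload length. */
static inline size_t load_app_data_commands(size_t app_size, size_t payload_len)
{
	if (payload_len == 0)
		return 0;
	/* Ceiling division without risking overflow in the addition. */
	return app_size / payload_len + (app_size % payload_len != 0);
}
```

Per the steps above, all but the last of these commands would be answered with `FW_RSP_LOAD_APP_DATA`, and the last one with `FW_RSP_LOAD_APP_DATA_READY`.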
+
+### Developing firmware
+
+Standing in `hw/application_fpga/` you can run `make firmware.elf` to
+build just the firmware. You don't need all the FPGA development tools
+mentioned in [Toolchain setup](../toolchain_setup.md).
+
+You need clang with 32-bit RISC-V support for `-march=rv32iczmmul`,
+which comes in clang 15. If you don't have version 15 you might get by
+with `-march=rv32imc` but things will break if you ever cause it to emit
+`div` instructions.
+
+You also need `llvm-ar`, `llvm-objcopy`, `llvm-size` and `lld`.
+Typically these are all available in packages called "clang", "llvm",
+"lld" or similar.
+
+If your available `objcopy` and `size` commands are anything other than
+the default `llvm-objcopy` and `llvm-size` define `OBJCOPY` and `SIZE`
+to whatever they're called on your system before calling `make
+firmware.elf`.
+
+If you want to use our emulator, clone the `tk1` branch of [our
+version of qemu](https://github.com/tillitis/qemu) and build:
+
+```
+$ git clone -b tk1 https://github.com/tillitis/qemu
+$ mkdir qemu/build
+$ cd qemu/build
+$ ../configure --target-list=riscv32-softmmu --disable-werror
+$ make -j $(nproc)
+```
+
+(Built with warnings-as-errors disabled, see [this
+issue](https://github.com/tillitis/qemu/issues/3).)
+
+Run it like this:
+
+```
+$ /path/to/qemu/build/qemu-system-riscv32 -nographic -M tk1,fifo=chrid -bios firmware.elf \
+  -chardev pty,id=chrid
+```
+
+This attaches the FIFO to a tty, something like `/dev/pts/16` which
+you can use with host software to talk to the firmware.
+
+To quit QEMU you can use: `Ctrl-a x` (see `Ctrl-a ?` for other commands).
+
+Debugging? Use the HTIF console by removing `-DNOCONSOLE` from the
+`CFLAGS` and using the helper functions in `lib.c` like `htif_puts()`,
+`htif_putinthex()`, `htif_hexdump()` and friends for printf-like
+debugging.
+
+You can add `-d guest_errors` to the qemu command line to make QEMU
+send errors from the TK1 machine to stderr, typically things like
+memory writes outside of mapped regions.
+
+You can also use the qemu monitor for debugging, e.g. `info
+registers`, or run qemu with `-d in_asm` or `-d trace:riscv_trap`.
+
+### Reset
+
+The PicoRV32 starts executing at `0x0000_0000`. We allow no `.data` or
+`.bss` sections. Our firmware starts at the `_start` symbol in
+`start.S`, which first clears all RAM of any data remaining from
+previous applications, then initializes a stack starting at the top of
+`FW_RAM` at `0xd000_0800` and growing downwards, and then calls `main`.
+
+When the initialization is finished, the firmware waits for incoming
+commands from the host, by busy-polling the `UART_RX_{STATUS,DATA}`
+registers. When a complete command is read, the firmware executes the
+command.
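A minimal sketch of such a busy-polling read is shown below. It assumes the status register reads non-zero when a byte is waiting and that the data register then holds that byte in its low 8 bits. Neither detail is spelled out in this text. The register pointers are passed in as parameters so that the sketch does not invent MMIO addresses, and the name `uart_read_byte` is hypothetical.

```c
#include <stdint.h>

/* Spin until the (assumed) RX status register is non-zero, then return
 * the low byte of the RX data register. */
static inline uint8_t uart_read_byte(const volatile uint32_t *rx_status,
				     const volatile uint32_t *rx_data)
{
	while (*rx_status == 0) {
		/* busy-wait for incoming data */
	}
	return (uint8_t)(*rx_data & 0xffu);
}
```

On the device the two pointers would be the `UART_RX_STATUS` and `UART_RX_DATA` addresses from the memory map.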
+
+### Firmware state machine
+
+States:
+
+- `initial` - At start.
+- `loading` - Expect application data.
+- `run` - Computes CDI and starts the application.
+- `fail` - Stops waiting for commands, flashes LED forever.
+
+Commands in state `initial`:
+
+| *command*             | *next state* |
+|-----------------------|--------------|
+| `FW_CMD_NAME_VERSION` | unchanged    |
+| `FW_CMD_GET_UDI`      | unchanged    |
+| `FW_CMD_LOAD_APP`     | `loading`    |
+
+Commands in state `loading`:
+
+| *command*              | *next state*                     |
+|------------------------|----------------------------------|
+| `FW_CMD_LOAD_APP_DATA` | unchanged or `run` on last chunk |
+
+Commands in state `run`: None.
+
+Commands in state `fail`: None.
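The tables above can be read as a small transition function, sketched below. The enum names and values are illustrative only and are not the protocol's wire encoding. Sending a command that is not listed for the current state is assumed here to lead to `fail`, which this text does not state explicitly. `run` and `fail` accept no commands, so the sketch leaves them unchanged.

```c
#include <stdbool.h>

enum fw_state { FW_STATE_INITIAL, FW_STATE_LOADING, FW_STATE_RUN, FW_STATE_FAIL };

/* Illustrative command identifiers, not the on-the-wire values. */
enum fw_cmd {
	FW_CMD_NAME_VERSION,
	FW_CMD_GET_UDI,
	FW_CMD_LOAD_APP,
	FW_CMD_LOAD_APP_DATA
};

/* Next state for a command received in the given state.
 * last_chunk only matters for FW_CMD_LOAD_APP_DATA. */
static inline enum fw_state fw_next_state(enum fw_state state, enum fw_cmd cmd,
					  bool last_chunk)
{
	switch (state) {
	case FW_STATE_INITIAL:
		if (cmd == FW_CMD_NAME_VERSION || cmd == FW_CMD_GET_UDI)
			return FW_STATE_INITIAL;
		if (cmd == FW_CMD_LOAD_APP)
			return FW_STATE_LOADING;
		return FW_STATE_FAIL; /* assumption: unlisted command fails */
	case FW_STATE_LOADING:
		if (cmd == FW_CMD_LOAD_APP_DATA)
			return last_chunk ? FW_STATE_RUN : FW_STATE_LOADING;
		return FW_STATE_FAIL; /* assumption: unlisted command fails */
	case FW_STATE_RUN:
	case FW_STATE_FAIL:
		return state; /* no commands are accepted in these states */
	}
	return FW_STATE_FAIL;
}
```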
+
+See [Firmware protocol in the Dev
+Handbook](http://dev.tillitis.se/protocol/#firmware-protocol) for the
+definition of the specific commands and their responses.
+
+### User-supplied Secret (USS)
+
+The USS is a 32-byte secret provided by the user. Typically a host
+program gets a secret from the user and then runs it through a key
+derivation function of some sort, for instance BLAKE2s, to get 32
+bytes which it sends to the firmware to be part of the CDI computation.
+
+### Compound Device Identifier computation
+
+The CDI is computed by:
+
+```
+CDI = blake2s(UDS, blake2s(app), USS)
+```
+
+In an ideal world, software would never be able to read UDS at all and
+we would have a BLAKE2s function in hardware that would be the only
+thing able to read the UDS. Unfortunately, we couldn't fit a BLAKE2s
+implementation in the FPGA at this time.
+
+The firmware instead does the CDI computation using the special
+firmware-only `FW_RAM` which is invisible after switching to app mode.
+We keep the entire firmware stack in `FW_RAM` and clear it just before
+switching to app mode just in case.
+
+We sleep for a random number of cycles before reading out the UDS,
+call `blake2s_update()` with it and then immediately call
+`blake2s_update()` again with the program digest, destroying the UDS
+stored in the internal context buffer. UDS should now not be in
+`FW_RAM` anymore. We can read UDS only once per power cycle so UDS
+should now not be available to firmware at all.
+
+Then we continue with the CDI computation by updating with an optional
+USS and then finalizing the hash, storing the resulting digest in
+`CDI`.
+
+### Firmware services
+
+The firmware exposes a BLAKE2s function through a function pointer
+located in MMIO `BLAKE2S` (see [memory
+map](system_description.md#memory-mapped-hardware-functions)) with the
+function signature:
+
+```c
+int blake2s(void *out, unsigned long outlen, const void *key,
+	    unsigned long keylen, const void *in, unsigned long inlen,
+	    blake2s_ctx *ctx);
+
+```
+
+where `blake2s_ctx` is:
+
+```c
+typedef struct {
+	uint8_t b[64]; // input buffer
+	uint32_t h[8]; // chained state
+	uint32_t t[2]; // total number of bytes
+	size_t c;      // pointer for b[]
+	size_t outlen; // digest size
+} blake2s_ctx;
+```
+
+The `libcommon` library in
+[tillitis-key1-apps](https://github.com/tillitis/tillitis-key1-apps/)
+has a wrapper for using this function called `blake2s()`.
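To show what calling through that pointer could look like, here is a hedged sketch. It is not the `libcommon` wrapper. The register is passed in as a pointer so that no MMIO address is invented, and it is typed `uintptr_t` on the assumption that the register is as wide as a pointer, as on RV32. `call_fw_blake2s` and `fake_blake2s` are hypothetical names. `fake_blake2s` is only a stand-in for trying the sketch off-device and does not compute BLAKE2s.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
	uint8_t b[64]; // input buffer
	uint32_t h[8]; // chained state
	uint32_t t[2]; // total number of bytes
	size_t c;      // pointer for b[]
	size_t outlen; // digest size
} blake2s_ctx;

/* Function pointer type matching the signature given above. */
typedef int (*blake2s_fn)(void *out, unsigned long outlen, const void *key,
			  unsigned long keylen, const void *in,
			  unsigned long inlen, blake2s_ctx *ctx);

/* Read the function address from the register and make an unkeyed call. */
static inline int call_fw_blake2s(const volatile uintptr_t *blake2s_reg,
				  void *out, unsigned long outlen,
				  const void *in, unsigned long inlen,
				  blake2s_ctx *ctx)
{
	blake2s_fn fn = (blake2s_fn)*blake2s_reg;

	return fn(out, outlen, NULL, 0, in, inlen, ctx);
}

/* Off-device stand-in, NOT a hash: fills out with the input length. */
static inline int fake_blake2s(void *out, unsigned long outlen, const void *key,
			       unsigned long keylen, const void *in,
			       unsigned long inlen, blake2s_ctx *ctx)
{
	(void)key; (void)keylen; (void)in; (void)ctx;
	memset(out, (int)inlen, outlen);
	return 0;
}
```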
+
+## Applications
+
+See [our apps repo](https://github.com/tillitis/tillitis-key1-apps)
+for examples of client and TKey apps as well as libraries for writing
+both.