| .. | ||
| testfw | ||
| tk1 | ||
| .clang-format | ||
| README.md | ||
| tk1_mem.h | ||
Tillitis TKey software
NOTE: Documentation migrated to dev.tillitis.se, this is kept for history. This is likely to be outdated.
Introduction
This text is both an introduction to and a requirement specification of the TKey firmware, its protocol, and an overview of how TKey applications are supposed to work. For an overview of the TKey concepts, see System Description.
First, some definitions:
- Firmware - software in ROM responsible for loading applications. The firmware is included as part of the FPGA bit stream.
- Application or app - software supplied by the host machine which is received, loaded, measured, and started by the firmware.
The TKey has two modes of software operation: firmware mode and application mode. The firmware mode has the responsibility of receiving, measuring, loading, and starting the application. When the firmware is about to start the application it switches to a more constrained environment, the application mode.
The firmware and application uses a memory mapped input/output (MMIO) for communication with the hardware. The memory map is constrained when running in application mode, e.g. FW-RAM and UDS isn't readable, and several MMIO addresses are either not readable or not writable for the application.
See table in the System Description for details about access rules control in the memory system and MMIO.
The firmware (and optionally all software) on the TKey can communicate
to the host via the UART_{RX,TX}_{STATUS,DATA} registers, using the
framing protocol described in the Framing
Protocol.
The firmware defines a protocol on top of this framing layer which is used to bootstrap the application. All commands are initiated by the host. All commands receive a reply. See Firmware protocol for specific details.
Applications define their own protocol used for communication with their host part. They may or may not be based on the Framing Protocol.
CPU
The CPU is a PicoRV32, a 32-bit RISC-V processor (arch: RV32IC_Zmmul) which runs the firmware and the application. The firmware and application both run in RISC-V machine mode. All types are little-endian.
Constraints
- ROM: 6 kByte.
- RAM: 128 kByte.
Firmware
The purpose of the firmware is to bootstrap and measure an application.
The TKey has 128 kilobyte RAM. Firmware loads the application at the
start of RAM. The current C runtime (crt0.S) of apps in our apps
repo sets up the
stack to start just below the end of RAM. This means that a larger app
comes at the compromise of it having a smaller stack.
The firmware is part of FPGA bitstream (ROM), and is loaded at
0x0000_0000.
When the firmware starts it clears all RAM and then wait for commands
coming in on UART_RX.
Typical use scenario:
-
The host sends the
FW_CMD_LOAD_APPcommand with the size of the device app and the optional user-supplied secret as arguments and and gets aFW_RSP_LOAD_APPback. After using this it's not possible to restart the loading of an application. -
If the the host receive a sucessful response, it will send multiple
FW_CMD_LOAD_APP_DATAcommands, together containing the full application. -
On receiving
FW_CMD_LOAD_APP_DATAcommands the firmware places the data into0x4000_0000and upwards. The firmware replies with aFW_RSP_LOAD_APP_DATAresponse to the host for each received block except the last data block. -
When the final block of the application image is received with a
FW_CMD_LOAD_APP_DATA, the firmware measure the application by computing a BLAKE2s digest over the entire application. Then firmware send back theFW_RSP_LOAD_APP_DATA_READYresponse containing the measurement. -
The Compound Device Identifier (CDI) is then computed by using the
UDS, application digest, and theUSS, and placed inCDI. (see Compound Device Identifier computation) Then the start address of the device app,0x4000_0000, is written toAPP_ADDRand the size toAPP_SIZEto let the device application know where it is loaded and how large it is, if it wants to relocate in RAM. -
The firmware now clears the special
FW_RAMwhere it keeps it stack. After this it does no more function calls and uses no more automatic variables. -
Firmware starts the application by first switching to application mode by writing to the
SWITCH_APPregister. In this mode the MMIO region is restricted, e.g. some registers are removed (UDS), and some are switched from read/write to read-only (see memory map).Then the firmware jumps to what's in
APP_ADDRwhich starts the application.There is now no other means of getting back from application mode to firmware mode than resetting/power cycling the device.
Developing firmware
Standing in hw/application_fpga/ you can run make firmware.elf to
build just the firmware. You don't need all the FPGA development tools
mentioned in Toolchain setup.
You need clang with 32 RISC-V support -march=rv32iczmmul which comes
in clang 15. If you don't have version 15 you might get by with
-march=rv32imc but things will break if you ever cause it to emit
div instructions.
You also need llvm-ar, llvm-objcopy, llvm-size and lld.
Typically these are all available in packages called "clang", "llvm",
"lld" or similar.
If your available objcopy and size commands is anything other than
the default llvm-objcopy and llvm-size define OBJCOPY and SIZE
to whatever they're called on your system before calling make firmware.elf.
If you want to use our emulator, clone the tk1 branch of our
version of qemu and build:
$ git clone -b tk1 https://github.com/tillitis/qemu
$ mkdir qemu/build
$ cd qemu/build
$ ../configure --target-list=riscv32-softmmu --disable-werror
$ make -j $(nproc)
(Built with warnings-as-errors disabled, see this issue.)
Run it like this:
$ /path/to/qemu/build/qemu-system-riscv32 -nographic -M tk1,fifo=chrid -bios firmware.elf \
-chardev pty,id=chrid
This attaches the FIFO to a tty, something like /dev/pts/16 which
you can use with host software to talk to the firmware.
To quit QEMU you can use: Ctrl-a x (see Ctrl-a ? for other commands).
Debugging? Use the HTIF console by removing -DNOCONSOLE from the
CFLAGS and using the helper functions in lib.c like htif_puts()
htif_putinthex() htif_hexdump() and friends for printf-like
debugging.
You can add -d guest_errors to the qemu commandline to make QEMU
send errors from the TK1 machine to stderr, typically things like
memory writes outside of mapped regions.
You can also use the qemu monitor for debugging, e.g. info registers, or run qemu with -d in_asm or -d trace:riscv_trap.
Reset
The PicoRV32 starts executing at 0x0000_0000. We allow no .data or
.bss sections. Our firmware starts at the _start symbol in
start.S which first clears all RAM for any remaining data from
previous applications, then initializes a stack starting at the top of
FW_RAM at 0xd000_0800 and downwards and then calls main.
When the initialization is finished, the firmware waits for incoming
commands from the host, by busy-polling the UART_RX_{STATUS,DATA}
registers. When a complete command is read, the firmware executes the
command.
Firmware state machine
States:
initial- At start.loading- Expect application data.run- Computes CDI and starts the application.fail- Stops waiting for commands, flashes LED forever.
Commands in state initial:
| command | next state |
|---|---|
FW_CMD_NAME_VERSION |
unchanged |
FW_CMD_GET_UDI |
unchanged |
FW_CMD_LOAD_APP |
loading |
Commands in state loading:
| command | next state |
|---|---|
FW_CMD_LOAD_APP_DATA |
unchanged or run on last chunk |
Commands in state run: None.
Commands in state fail: None.
See Firmware protocol in the Dev Handbook for the definition of the specific commands and their responses.
User-supplied Secret (USS)
USS is a 32 bytes long secret provided by the user. Typically a host program gets a secret from the user and then does a key derivation function of some sort, for instance a BLAKE2s, to get 32 bytes which it sends to the firmware to be part of the CDI computation.
Compound Device Identifier computation
The CDI is computed by:
CDI = blake2s(UDS, blake2s(app), USS)
In an ideal world, software would never be able to read UDS at all and we would have a BLAKE2s function in hardware that would be the only thing able to read the UDS. Unfortunately, we couldn't fit a BLAKE2s implementation in the FPGA at this time.
The firmware instead does the CDI computation using the special
firmware-only FW_RAM which is invisible after switching to app mode.
We keep the entire firmware stack in FW_RAM and clear it just before
switching to app mode just in case.
We sleep for a random number of cycles before reading out the UDS,
call blake2s_update() with it and then immediately call
blake2s_update() again with the program digest, destroying the UDS
stored in the internal context buffer. UDS should now not be in
FW_RAM anymore. We can read UDS only once per power cycle so UDS
should now not be available to firmware at all.
Then we continue with the CDI computation by updating with an optional
USS and then finalizing the hash, storing the resulting digest in
CDI.
Firmware services
The firmware exposes a BLAKE2s function through a function pointer
located in MMIO BLAKE2S (see memory
map) with the
with function signature:
int blake2s(void *out, unsigned long outlen, const void *key,
unsigned long keylen, const void *in, unsigned long inlen,
blake2s_ctx *ctx);
where blake2s_ctx is:
typedef struct {
uint8_t b[64]; // input buffer
uint32_t h[8]; // chained state
uint32_t t[2]; // total number of bytes
size_t c; // pointer for b[]
size_t outlen; // digest size
} blake2s_ctx;
The libcommon library in
tillitis-key1-apps
has a wrapper for using this function called blake2s().
Applications
See our apps repo for examples of client and TKey apps as well as libraries for writing both.