Reorganize files to account for new "External" section

QubesOS/qubes-issues#4693
Andrew David Wong 2019-05-26 19:32:45 -05:00
parent 5cc99a23d1
commit d31c786942
No known key found for this signature in database
GPG key ID: 8CE137352A019A17
203 changed files with 0 additions and 0 deletions


@ -0,0 +1,36 @@
---
layout: doc
title: Architecture
permalink: /doc/architecture/
redirect_from:
- /doc/qubes-architecture/
- /en/doc/qubes-architecture/
- /doc/QubesArchitecture/
- /wiki/QubesArchitecture/
---
Qubes Architecture Overview
===========================
Qubes implements a Security by Isolation approach. To do this, Qubes utilizes virtualization technology in order to isolate various programs from each other and even to sandbox many system-level components, such as networking and storage subsystems, so that the compromise of any of these programs or components does not affect the integrity of the rest of the system.
Qubes lets the user define many security domains, which are implemented as lightweight Virtual Machines (VMs), or “AppVMs.” For example, the user can have “personal,” “work,” “shopping,” “bank,” and “random” AppVMs and can use the applications within those VMs just as if they were executing on the local machine. At the same time, however, these applications are well isolated from each other. Qubes also supports secure copy-and-paste and file sharing between the AppVMs, of course.
[![qubes-arch-diagram-1.png](/attachment/wiki/QubesArchitecture/qubes-arch-diagram-1.png)](/attachment/wiki/QubesArchitecture/qubes-arch-diagram-1.png)
(Note: In the diagram above, "Storage domain" is actually a USB domain.)
Key Architecture features
-------------------------
- Based on a secure bare-metal hypervisor (Xen)
- Networking code sandboxed in an unprivileged VM (using IOMMU/VT-d)
- USB stacks and drivers sandboxed in an unprivileged VM (currently an experimental feature)
- No networking code in the privileged domain (dom0)
- All user applications run in “AppVMs,” lightweight VMs based on Linux
- Centralized updates of all AppVMs based on the same template
- Qubes GUI virtualization presents applications as if they were running locally
- Qubes GUI provides isolation between apps sharing the same desktop
- Secure system boot (optional)
[Architecture Spec v0.3 [PDF]](/attachment/wiki/QubesArchitecture/arch-spec-0.3.pdf) (The original 2009 document that started this all...)

developer/system/audio.md Normal file

@ -0,0 +1,70 @@
---
layout: doc
title: Audio Virtualization
permalink: /doc/audio-virtualization/
---
Audio Virtualization
====================
VMs on Qubes OS have access to virtualized audio through the PulseAudio module.
It consists of two parts:
- `pacat-simple-vchan` running in a dom0/Audio VM (standalone application, one per VM, connected to the PulseAudio daemon)
- `module-vchan-sink` running in a VM (loaded into the PulseAudio process)
Protocol
--------
The protocol between these two parts is designed to be as simple as possible.
Specifically, there is no audio format negotiation, no compression, etc.
All the audio data is transferred as raw 44.1kHz samples in S16LE format, 2 channels.
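
Because the format is fixed, the size of the raw stream follows directly from the numbers above. The following is a minimal sketch of that arithmetic; the function names are illustrative and are not part of the Qubes code base.

```python
SAMPLE_RATE_HZ = 44_100   # 44.1 kHz, as stated in the protocol description
BYTES_PER_SAMPLE = 2      # S16LE: signed 16-bit little-endian
CHANNELS = 2              # stereo


def frame_size() -> int:
    """Bytes in one frame (one sample for every channel)."""
    return BYTES_PER_SAMPLE * CHANNELS


def bytes_per_second() -> int:
    """Raw throughput of one direction of the audio stream."""
    return SAMPLE_RATE_HZ * frame_size()


def duration_seconds(num_bytes: int) -> float:
    """Playback time represented by a buffer of raw samples."""
    return num_bytes / bytes_per_second()
```

With these values, one frame is 4 bytes and one second of audio in one direction is 176,400 bytes.
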
These two parts are connected with two vchan links:
1. Connection on vchan port 4713 used to transfer audio from VM to dom0/Audio VM.
There is no negotiation, no headers, etc.
Only raw samples are sent over this channel.
2. Connection on vchan port 4714 used to transfer audio from dom0/Audio VM to VM.
Like the previous one, audio data is sent as raw samples.
Additionally, the second vchan connection (on port 4714) is used in the opposite direction (VM to dom0) to send simple notifications/commands about the audio state.
Each such notification is a 4-byte number in little-endian format.
List of defined codes:
- `0x00010001` -- VM wants to receive audio input (some process is listening); prior to this message, `pacat-simple-vchan` will not send any audio samples to the VM.
- `0x00010000` -- VM does not want to receive audio input (no process is listening anymore); after this message, `pacat-simple-vchan` will not send any audio samples to the VM.
- `0x00020000` -- VM does not want to send audio output; informational for dom0, to avoid buffer underruns (may affect PulseAudio calculated delays).
- `0x00020001` -- VM does want to send audio output.
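
The notification codes listed above can be decoded with a few lines of Python. This is only a sketch of the wire format described here (4-byte little-endian numbers); the textual descriptions and the function name are illustrative, not taken from the Qubes sources.

```python
import struct

# Codes from the list above; the descriptions are paraphrased.
NOTIFICATIONS = {
    0x00010001: "VM wants to receive audio input",
    0x00010000: "VM does not want to receive audio input",
    0x00020000: "VM does not want to send audio output",
    0x00020001: "VM wants to send audio output",
}


def decode_notifications(data: bytes) -> list[str]:
    """Decode a byte string holding zero or more 4-byte notifications."""
    if len(data) % 4 != 0:
        raise ValueError("notifications are exactly 4 bytes each")
    return [
        NOTIFICATIONS.get(code, f"unknown code 0x{code:08x}")
        for (code,) in struct.iter_unpack("<I", data)
    ]
```

For example, the four bytes `01 00 01 00` decode to the "wants to receive audio input" notification, because the least significant byte comes first.
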
pacat-simple-vchan
------------------
This is the dom0 (or Audio VM) part of the audio virtualization.
It is responsible for transferring audio samples between the PulseAudio daemon in dom0/Audio VM (which has access to the actual audio hardware) and a VM playing/recording sounds.
A separate `pacat-simple-vchan` process runs in dom0 for each VM with audio.
Each of them opens separate input and output streams to their local PulseAudio, which allows for controlling the volume of each VM separately (using the `pavucontrol` tool, for example).
Audio input to the VM is not sent by default.
It needs to be both requested by the VM part and explicitly enabled in `pacat-simple-vchan`.
The mechanism to do this differs between Qubes versions.
In Qubes before R4.1, `pacat-simple-vchan` is controlled over system D-Bus:
- destination: `org.qubesos.Audio.VMNAME` (where `VMNAME` is the VM's name)
- object path: `/org/qubesos/audio`
- interface: `org.qubesos.Audio`
- property: `RecAllowed` (which can be set using the `org.freedesktop.DBus.Properties` interface)
In Qubes R4.1 and later, `pacat-simple-vchan` is controlled over a UNIX socket at `/var/run/qubes/audio-control.VMNAME` (where `VMNAME` is the VM's name).
Supported commands:
- `audio-input 1\n` - enable audio input
- `audio-input 0\n` - disable audio input
These commands can be sent using the `qubes.AudioInputEnable+VMNAME` and `qubes.AudioInputDisable+VMNAME` qrexec services, respectively.
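
A client for the R4.1 control socket could look roughly like the sketch below. The socket path and command strings come from the text above; the socket type (a stream socket) and the helper names are assumptions made for illustration, so check the `pacat-simple-vchan` sources before relying on them.

```python
import socket


def control_socket_path(vmname: str) -> str:
    """Path of the per-VM control socket, as documented above."""
    return f"/var/run/qubes/audio-control.{vmname}"


def audio_input_command(enable: bool) -> bytes:
    """Build the newline-terminated command documented above."""
    return b"audio-input 1\n" if enable else b"audio-input 0\n"


def set_audio_input(vmname: str, enable: bool) -> None:
    """Send the command to the control socket.

    Assumption: the socket is a SOCK_STREAM UNIX socket.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(control_socket_path(vmname))
        sock.sendall(audio_input_command(enable))
```

In normal use, the qrexec services named above are the intended way to send these commands; the sketch only makes the wire format concrete.
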
The current status is written into QubesDB at `/audio-input/VMNAME` (where `VMNAME` is the VM's name) with either `1` or `0` values.
The lack of a key means that the `pacat-simple-vchan` for a given VM is not running.
In either version, it is exposed to the user as a device of class `mic`, which can be attached to a VM (for example, using the `qvm-device mic` command).

developer/system/gui.md Normal file

@ -0,0 +1,412 @@
---
layout: doc
title: GUI
permalink: /doc/gui/
redirect_from:
- /en/doc/gui/
- /en/doc/gui-docs/
- /doc/GUIdocs/
- /wiki/GUIdocs/
---
Qubes GUI protocol
==================
qubes_gui and qubes_guid processes
------------------------------------
All AppVM X applications connect to local (running in AppVM) Xorg servers that use the following "hardware" drivers:
- *dummyqsb_drv* - a video driver that paints onto a framebuffer located in RAM, not connected to real hardware
- *qubes_drv* - provides a virtual keyboard and mouse (in fact, more, see below)
For each AppVM, there is a pair of *qubes_gui* (running in AppVM) and *qubes_guid* (running in dom0) processes connected over vchan.
The main responsibilities of *qubes_gui* are:
- call XCompositeRedirectSubwindows on the root window, so that each window has its own composition buffer
- instruct the local Xorg server to notify it about window creation, configuration and damage events; pass information on these events to dom0
- receive information about keyboard and mouse events from dom0, tell *qubes_drv* to fake appropriate events
- receive information about window size/position change, apply them to the local window
The main responsibilities of *qubes_guid* are:
- create a window in dom0 whenever information about window creation in the AppVM is received from *qubes_gui*
- whenever the local window receives an XEvent, pass information on it to the AppVM (particularly, mouse and keyboard data)
- whenever the AppVM signals a damage event, tell the local Xorg server to repaint a given window fragment
- receive information about window size/position change, apply them to the local window
Note that keyboard and mouse events are passed to AppVM only if a window belonging to this AppVM has focus.
An AppVM has no way to get information on keystrokes fed to other AppVMs (e.g. the XTEST extension will report the status of the local AppVM keyboard only) or to synthesize and pass events to other AppVMs.
Window content updates implementation
-------------------------------------
Typical remote desktop applications, like *vnc*, pass information on all changed window content in-band (say, over TCP).
As that channel has limited throughput, this impacts video performance.
In the case of Qubes, *qubes_gui* does not transfer all changed pixels via vchan. Instead, for each window, upon its creation or size change, *qubes_gui*
- asks *qubes_drv* driver for the list of physical memory frames that hold the composition buffer of a window
- passes this information via `MFNDUMP` message to *qubes_guid* in dom0
Now, *qubes_guid* has to tell the dom0 Xorg server about the location of the buffer.
There is no supported way (e.g. Xorg extension) to do this zero-copy style.
The following method is used in Qubes:
- in dom0, the Xorg server is started with an *LD_PRELOAD*-ed library named *shmoverride.so*. This library hooks all function calls related to shared memory.
- *qubes_guid* creates a shared memory segment and then tells Xorg to attach it via the *MIT-SHM* extension
- when Xorg tries to attach the segment (via glibc *shmat*), *shmoverride.so* intercepts this call and instead maps AppVM memory via *xc_map_foreign_pages*
- from then on, we can use MIT-SHM functions, e.g. *XShmPutImage*, to draw onto a dom0 window. *XShmPutImage* will paint at DRAM speed; actually, many drivers use DMA for this.
The important detail is that *xc_map_foreign_pages* verifies that a given mfn range actually belongs to a given domain id (and the latter is provided by trusted *qubes_guid*).
Therefore, a rogue AppVM cannot gain anything by passing crafted mfns in the `MFNDUMP` message.
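
To make the `MFNDUMP` bookkeeping more concrete, the sketch below estimates how many memory frames a composition buffer spans. It rests on assumptions that are not stated in this document: 4 KiB frames, `bpp` given in bits per pixel, and tightly packed rows with no padding. The function name is hypothetical.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB memory frames


def frames_needed(width: int, height: int, bpp: int, off: int) -> int:
    """Frames spanned by a buffer that starts `off` bytes into its first frame.

    Assumes `bpp` is bits per pixel and rows are tightly packed.
    """
    if not 0 <= off < PAGE_SIZE:
        raise ValueError("offset must lie within the first frame")
    buffer_bytes = width * height * (bpp // 8)
    # Round up to whole frames.
    return (off + buffer_bytes + PAGE_SIZE - 1) // PAGE_SIZE
```

Under these assumptions, a 1024x768 window at 32 bits per pixel occupies 768 frames when it starts on a frame boundary, and one more when it starts at a non-zero offset.
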
To sum up, this solution has the following benefits:
- window updates at DRAM speed
- no changes to Xorg code
- minimal size of the supporting code
![gui.png](/attachment/wiki/GUIdocs/gui.png)
Security markers on dom0 windows
--------------------------------
It is important that the user knows which AppVM a given window belongs to. This prevents a rogue AppVM from painting a window that pretends to belong to another AppVM or dom0 in order to steal, for example, passwords.
In Qubes, a custom window decorator is used that paints a colourful frame (the colour is determined during AppVM creation) around decorated windows. Additionally, the window title always starts with **[name of the AppVM]**. If a window has an *override_redirect* attribute, meaning that it should not be treated by a window manager (typical case is menu windows), *qubes_guid* draws a two-pixel colourful frame around it manually.
Clipboard sharing implementation
--------------------------------
Certainly, it would be insecure to allow an AppVM to read/write the clipboards of other AppVMs unconditionally.
Therefore, the following mechanism is used:
- there is a "qubes clipboard" in dom0 - its contents are stored in a regular file in dom0.
- if the user wants to copy the local AppVM clipboard to the qubes clipboard, she must focus on any window belonging to this AppVM and press **Ctrl-Shift-C**. This combination is trapped by *qubes-guid*, and a `CLIPBOARD_REQ` message is sent to the AppVM. *qubes-gui* responds with a `CLIPBOARD_DATA` message followed by the clipboard contents.
- the user then focuses on a window of another AppVM and presses **Ctrl-Shift-V**. This combination is trapped by *qubes-guid*, and a `CLIPBOARD_DATA` message followed by the qubes clipboard contents is sent to the AppVM; *qubes_gui* copies the data to the local clipboard, and the user can then paste its contents into local applications normally.
This way, the user can quickly copy clipboards between AppVMs.
This action is fully controlled by the user; it cannot be triggered or forced by any AppVM.
*qubes_gui* and *qubes_guid* code notes
-----------------------------------------
Both applications are structured similarly. They use the *select* function to wait on either of two event sources:
- messages from the local X server
- messages from the vchan connecting to the remote party
The XEvents are handled by the *handle_xevent_eventname* function, and messages are handled by the *handle_messagename* function. One should be very careful when altering the actual *select* loop, because both XEvents and vchan messages are buffered, and *select* will not wake for each message.
If one changes the number/order/signature of messages, one should increase the *QUBES_GUID_PROTOCOL_VERSION* constant in the *messages.h* include file.
*qubes_guid* writes debugging information to the */var/log/qubes/qubes.domain_id.log* file; *qubes_gui* writes debugging information to */var/log/qubes/gui_agent.log*.
Include these files when reporting a bug.
AppVM -> dom0 messages
-----------------------
Proper handling of the below messages is security-critical.
Observe that, besides two messages (`CLIPBOARD_DATA` and `MFNDUMP`), the rest have a fixed size, so the parsing code can be small.
The *override_redirect* window attribute is explained at [Override Redirect Flag](https://tronche.com/gui/x/xlib/window/attributes/override-redirect.html). The *transient_for* attribute is explained at [Transient_for attribute](https://tronche.com/gui/x/icccm/sec-4.html#WM_TRANSIENT_FOR).
Window manager hints and flags are described in the [Extended Window Manager Hints (EWMH) spec](https://standards.freedesktop.org/wm-spec/latest/), especially under the `_NET_WM_STATE` section.
Each message starts with the following header:
~~~
struct msghdr {
uint32_t type;
uint32_t window;
/* This field is intended for use by gui_agents to skip unknown
* messages from the (trusted) guid. Guid, on the other hand,
* should never rely on this field to calculate the actual len of
* message to be read, as the (untrusted) agent can put here
* whatever it wants! */
uint32_t untrusted_len;
};
~~~
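
As an illustration, the header above can be parsed with Python's `struct` module. This sketch assumes little-endian byte order and no padding between the three `uint32_t` fields, neither of which is stated explicitly in this document; `MsgHdr` and `parse_header` are illustrative names.

```python
import struct
from typing import NamedTuple

# Assumption: three little-endian uint32_t fields with no padding.
HEADER = struct.Struct("<III")


class MsgHdr(NamedTuple):
    type: int
    window: int
    untrusted_len: int  # set by the untrusted agent; never trust it for sizing


def parse_header(data: bytes) -> MsgHdr:
    """Parse the 12-byte header at the start of `data`."""
    if len(data) < HEADER.size:
        raise ValueError("truncated header")
    return MsgHdr(*HEADER.unpack_from(data))
```

The receiving side should treat `untrusted_len` as informational only, in line with the comment in the struct definition.
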
This header is followed by message-specific data:
<table class="table">
<tr>
<th>Message name</th>
<th>Structure after header</th>
<th>Action</th>
</tr>
<tr>
<td>MSG_CLIPBOARD_DATA</td>
<td>amorphic blob (in protocol before 1.2, length determined by the "window" field, in 1.2 and later - by untrusted_len in the header)</td>
<td>Store the received clipboard content (not parsing in any way)</td>
</tr>
<tr>
<td>MSG_CREATE</td>
<td><pre>
struct msg_create {
uint32_t x;
uint32_t y;
uint32_t width;
uint32_t height;
uint32_t parent;
uint32_t override_redirect;
};
</pre>
</td>
<td>Create a window with given parameters</td>
</tr>
<tr>
<td>MSG_DESTROY</td>
<td>None</td>
<td>Destroy a window</td>
</tr>
<tr>
<td>MSG_MAP</td>
<td><pre>
struct msg_map_info {
uint32_t transient_for;
uint32_t override_redirect;
};
</pre></td>
<td>Map a window with given parameters</td>
</tr>
<tr>
<td>MSG_UNMAP</td>
<td>None</td>
<td>Unmap a window</td>
</tr>
<tr>
<td>MSG_CONFIGURE</td>
<td><pre>
struct msg_configure {
uint32_t x;
uint32_t y;
uint32_t width;
uint32_t height;
uint32_t override_redirect;
};
</pre></td>
<td>Change window position/size/type</td>
</tr>
<tr>
<td>MSG_MFNDUMP</td>
<td><pre>
struct shm_cmd {
uint32_t shmid;
uint32_t width;
uint32_t height;
uint32_t bpp;
uint32_t off;
uint32_t num_mfn;
uint32_t domid;
uint32_t mfns[0];
};
</pre></td>
<td>Retrieve the array of mfns that constitute the composition buffer of a remote window.
The "num_mfn" 32bit integers follow the shm_cmd structure; "off" is the offset of the composite buffer start in the first frame; "shmid" and "domid" parameters are just placeholders (to be filled by *qubes_guid*), so that we can use the same structure when talking to *shmoverride.so*
</td>
</tr>
<tr>
<td>MSG_SHMIMAGE</td>
<td><pre>
struct msg_shmimage {
uint32_t x;
uint32_t y;
uint32_t width;
uint32_t height;
};
</pre> </td>
<td>Repaint the given window fragment</td>
</tr>
<tr>
<td>MSG_WMNAME</td>
<td><pre>
struct msg_wmname {
char data[128];
};
</pre></td>
<td>Set the window name; only printable characters are allowed</td>
</tr>
<tr>
<td>MSG_DOCK</td>
<td>None</td>
<td>Dock the window in the tray</td>
</tr>
<tr>
<td>MSG_WINDOW_HINTS</td>
<td><pre>
struct msg_window_hints {
uint32_t flags;
uint32_t min_width;
uint32_t min_height;
uint32_t max_width;
uint32_t max_height;
uint32_t width_inc;
uint32_t height_inc;
uint32_t base_width;
uint32_t base_height;
};
</pre> </td>
<td>Size hints for window manager</td>
</tr>
<tr>
<td>MSG_WINDOW_FLAGS</td>
<td><pre>
struct msg_window_flags {
uint32_t flags_set;
uint32_t flags_unset;
};
</pre> </td>
<td>Change window state request; the fields contain bitmasks indicating which flags are requested to be set and which to be unset</td>
</tr>
</table>
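
Since most message bodies in the table above have a fixed size, a receiver can derive the number of bytes to read from the message type alone and ignore `untrusted_len`. The sketch below illustrates this. It keys the table by message name because numeric type codes are not listed in this document, and it assumes little-endian fields without padding; the function names are illustrative.

```python
import struct

# Body layouts copied from the table above (all integer fields are uint32_t).
# MSG_CLIPBOARD_DATA and MSG_MFNDUMP are variable-length and handled elsewhere.
BODY_FORMATS = {
    "MSG_CREATE": "<6I",
    "MSG_DESTROY": "",
    "MSG_MAP": "<2I",
    "MSG_UNMAP": "",
    "MSG_CONFIGURE": "<5I",
    "MSG_SHMIMAGE": "<4I",
    "MSG_WMNAME": "<128s",
    "MSG_DOCK": "",
    "MSG_WINDOW_HINTS": "<9I",
    "MSG_WINDOW_FLAGS": "<2I",
}


def body_size(name: str) -> int:
    """Size in bytes of the fixed body for the given message name."""
    return struct.calcsize(BODY_FORMATS[name])


def parse_body(name: str, payload: bytes) -> tuple:
    """Unpack a fixed-size body; the size comes from the type, not the sender."""
    size = body_size(name)
    if len(payload) < size:
        raise ValueError(f"{name}: expected {size} bytes, got {len(payload)}")
    return struct.unpack_from(BODY_FORMATS[name], payload)
```

Sizing each read from a trusted table keeps the security-critical parsing code small, which is the point made at the start of this section.
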
Dom0 -> AppVM messages
-----------------------
Proper handling of the below messages is NOT security-critical.
Each message starts with the following header:
~~~
struct msghdr {
uint32_t type;
uint32_t window;
};
~~~
The header is followed by message-specific data:
<table class="table">
<tr>
<th>Message name</th>
<th>Structure after header</th>
<th>Action</th>
</tr>
<tr>
<td>MSG_KEYPRESS</td>
<td><pre>
struct msg_keypress {
uint32_t type;
uint32_t x;
uint32_t y;
uint32_t state;
uint32_t keycode;
};
</pre> </td>
<td>Tell *qubes_drv* driver to generate a keypress</td>
</tr>
<tr>
<td>MSG_BUTTON</td>
<td><pre>
struct msg_button {
uint32_t type;
uint32_t x;
uint32_t y;
uint32_t state;
uint32_t button;
};
</pre> </td>
<td>Tell *qubes_drv* driver to generate a mouse click</td>
</tr>
<tr>
<td>MSG_MOTION</td>
<td><pre>
struct msg_motion {
uint32_t x;
uint32_t y;
uint32_t state;
uint32_t is_hint;
};
</pre> </td>
<td>Tell *qubes_drv* driver to generate a motion event</td>
</tr>
<tr>
<td>MSG_CONFIGURE</td>
<td><pre>
struct msg_configure {
uint32_t x;
uint32_t y;
uint32_t width;
uint32_t height;
uint32_t override_redirect;
};
</pre> </td>
<td>Change window position/size/type</td>
</tr>
<tr>
<td>MSG_MAP</td>
<td><pre>
struct msg_map_info {
uint32_t transient_for;
uint32_t override_redirect;
};
</pre> </td>
<td>Map a window with given parameters</td>
</tr>
<tr>
<td>MSG_CLOSE</td>
<td>None</td>
<td>Send wmDeleteMessage to the window</td>
</tr>
<tr>
<td>MSG_CROSSING</td>
<td><pre>
struct msg_crossing {
uint32_t type;
uint32_t x;
uint32_t y;
uint32_t state;
uint32_t mode;
uint32_t detail;
uint32_t focus;
};
</pre> </td>
<td>Notify window about enter/leave event</td>
</tr>
<tr>
<td>MSG_FOCUS</td>
<td><pre>
struct msg_focus {
uint32_t type;
uint32_t mode;
uint32_t detail;
};
</pre> </td>
<td>Raise a window, XSetInputFocus</td>
</tr>
<tr>
<td>MSG_CLIPBOARD_REQ</td>
<td>None</td>
<td>Retrieve the local clipboard, pass contents to gui-daemon</td>
</tr>
<tr>
<td>MSG_CLIPBOARD_DATA</td>
<td>amorphic blob</td>
<td>Insert the received data into local clipboard</td>
</tr>
<tr>
<td>MSG_EXECUTE</td>
<td>Obsolete</td>
<td>Obsolete, unused</td>
</tr>
<tr>
<td>MSG_KEYMAP_NOTIFY</td>
<td> unsigned char remote_keys[32]; </td>
<td>Synchronize the keyboard state (key pressed/released) with dom0</td>
</tr>
<tr>
<td>MSG_WINDOW_FLAGS</td>
<td><pre>
struct msg_window_flags {
uint32_t flags_set;
uint32_t flags_unset;
};
</pre> </td>
<td>Window state change confirmation</td>
</tr>
</table>
`KEYPRESS`, `BUTTON`, `MOTION`, `FOCUS` messages pass information extracted from dom0 XEvent; see appropriate event documentation.
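
As a rough sketch of the dom0 to AppVM direction, the code below serializes a keypress message from the two-field header and the `msg_keypress` body shown above. The numeric message type is a placeholder, since type codes are not listed in this document. The sketch also assumes little-endian fields without padding and assumes that the body's `type` field carries the X11 event type (2 for KeyPress, 3 for KeyRelease).

```python
import struct

DOM0_HEADER = struct.Struct("<II")    # type, window
KEYPRESS_BODY = struct.Struct("<5I")  # type, x, y, state, keycode

HYPOTHETICAL_MSG_KEYPRESS = 1  # placeholder; the real code is defined in the protocol headers
X11_KEY_PRESS = 2              # X11 core protocol event codes
X11_KEY_RELEASE = 3


def build_keypress(window: int, x: int, y: int, state: int,
                   keycode: int, pressed: bool = True) -> bytes:
    """Serialize the header followed by the msg_keypress body."""
    event_type = X11_KEY_PRESS if pressed else X11_KEY_RELEASE
    header = DOM0_HEADER.pack(HYPOTHETICAL_MSG_KEYPRESS, window)
    body = KEYPRESS_BODY.pack(event_type, x, y, state, keycode)
    return header + body
```

The resulting message is 28 bytes long: 8 bytes of header followed by 20 bytes of body.
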


@ -0,0 +1,71 @@
---
layout: doc
title: Networking
permalink: /doc/networking/
redirect_from:
- /doc/qubes-net/
- /en/doc/qubes-net/
- /doc/QubesNet/
- /wiki/QubesNet/
---
VM network in Qubes
===================
Overall description
-------------------
In Qubes, standard Xen networking is used, based on a backend driver in the driver domain and frontend drivers in VMs. In order to eliminate layer 2 attacks originating from a compromised VM, routed networking is used instead of the default bridging of `vif` devices, and NAT is applied at each network hop. The default *vif-route* script had some deficiencies (it requires the `eth0` device to be up and sets some redundant iptables rules); therefore, the custom *vif-route-qubes* script is used.
The IP address of the `eth0` interface in an AppVM, as well as two IP addresses to be used as nameservers (`DNS1` and `DNS2`), are passed via QubesDB to the AppVM during its boot (thus, there is no need for a DHCP daemon in the network driver domain). `DNS1` and `DNS2` are private addresses; whenever an interface is brought up in the network driver domain, the */usr/lib/qubes/qubes\_setup\_dnat\_to\_ns* script sets up the DNAT iptables rules translating `DNS1` and `DNS2` to the newly learned real DNS servers. This way, the AppVM networking configuration does not need to be changed when the configuration in the network driver domain changes (e.g., the user switches to a different WLAN). Moreover, there is no DNS server in the network driver domain either, and consequently there are no ports open to the VMs.
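
The DNAT translation described above can be pictured with the following sketch, which generates iptables commands mapping the fixed `DNS1`/`DNS2` addresses to the real resolvers. It is not the actual script: the use of the `PREROUTING` chain, the helper name, and all addresses in the usage example are assumptions made for illustration.

```python
import itertools


def dnat_rules(virtual_dns: list[str], real_dns: list[str]) -> list[str]:
    """Build iptables commands redirecting DNS traffic to the real servers.

    If there are fewer real servers than virtual addresses, the real
    servers are reused in order.
    """
    rules = []
    for virtual, real in zip(virtual_dns, itertools.cycle(real_dns)):
        for proto in ("udp", "tcp"):
            rules.append(
                f"iptables -t nat -A PREROUTING -d {virtual} -p {proto} "
                f"--dport 53 -j DNAT --to-destination {real}"
            )
    return rules
```

For example, `dnat_rules(["10.139.1.1", "10.139.1.2"], ["192.0.2.53"])` (hypothetical addresses) yields four rules, all pointing at the single real resolver. When the uplink changes, only these rules need to be regenerated; the AppVMs keep using the same two addresses.
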
Routing tables examples
-----------------------
VM routing table is simple:

|Destination|Gateway|Genmask|Flags|Metric|Ref|Use|Iface|
|---|---|---|---|---|---|---|---|
|0.0.0.0|0.0.0.0|0.0.0.0|U|0|0|0|eth0|

Network driver domain routing table is a bit longer:

|Destination|Gateway|Genmask|Flags|Metric|Ref|Use|Iface|
|---|---|---|---|---|---|---|---|
|10.137.0.16|0.0.0.0|255.255.255.255|UH|0|0|0|vif4.0|
|10.137.0.7|0.0.0.0|255.255.255.255|UH|0|0|0|vif10.0|
|10.137.0.9|0.0.0.0|255.255.255.255|UH|0|0|0|vif9.0|
|10.137.0.8|0.0.0.0|255.255.255.255|UH|0|0|0|vif8.0|
|10.137.0.12|0.0.0.0|255.255.255.255|UH|0|0|0|vif3.0|
|192.168.0.0|0.0.0.0|255.255.255.0|U|1|0|0|eth0|
|0.0.0.0|192.168.0.1|0.0.0.0|UG|0|0|0|eth0|

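
To see how the network driver domain's table is used, the sketch below performs a longest-prefix-match lookup over the example routes. It is a simplified model of what the kernel does (metrics and flags are ignored), and the function names are illustrative.

```python
import ipaddress

# (destination, genmask, gateway, interface), copied from the example table
ROUTES = [
    ("10.137.0.16", "255.255.255.255", "0.0.0.0", "vif4.0"),
    ("10.137.0.7", "255.255.255.255", "0.0.0.0", "vif10.0"),
    ("10.137.0.9", "255.255.255.255", "0.0.0.0", "vif9.0"),
    ("10.137.0.8", "255.255.255.255", "0.0.0.0", "vif8.0"),
    ("10.137.0.12", "255.255.255.255", "0.0.0.0", "vif3.0"),
    ("192.168.0.0", "255.255.255.0", "0.0.0.0", "eth0"),
    ("0.0.0.0", "0.0.0.0", "192.168.0.1", "eth0"),
]


def prefix_length(mask: str) -> int:
    """Number of leading one bits in a dotted-quad netmask."""
    return bin(int(ipaddress.IPv4Address(mask))).count("1")


def lookup(destination: str) -> tuple[str, str]:
    """Return (gateway, interface) of the most specific matching route."""
    addr = ipaddress.IPv4Address(destination)
    best = None
    for dest, mask, gateway, iface in ROUTES:
        network = ipaddress.ip_network(f"{dest}/{prefix_length(mask)}")
        if addr in network and (best is None or network.prefixlen > best[0]):
            best = (network.prefixlen, gateway, iface)
    if best is None:
        raise LookupError(f"no route to {destination}")
    return best[1], best[2]
```

Each VM is reached through a host route (`/32`) on its own `vif` interface, so traffic between VMs always passes through the routing and firewall layer of the driver domain rather than a shared layer 2 segment.
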
IPv6
----
Starting with Qubes 4.0, there is opt-in support for IPv6 forwarding. As with IPv4, traffic is routed and NAT is applied at each network gateway. This way, we avoid reconfiguring every connected qube whenever the uplink connection changes, and we even avoid telling the qube what that uplink is, which may be complex when VPN or other tunneling services are employed.
The feature can be enabled on any network-providing qube, and will be propagated down the network tree, so every qube connected to it will also have IPv6 enabled.
To enable the `ipv6` feature, use the `qvm-features` tool and set the value to `1`. For example, to enable it on `sys-net`, execute in dom0:

    qvm-features sys-net ipv6 1

It is also possible to explicitly disable IPv6 support for a qube, even if it is connected to an IPv6-providing one. This can be done by setting the `ipv6` feature to an empty value:

    qvm-features ipv4-only-qube ipv6 ''

This configuration is shown below: green qubes have IPv6 access; the red one does not.
![ipv6-1](/attachment/wiki/IPv6/ipv6-1.png)
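
The propagation rule described above can be modelled in a few lines. This is a toy model rather than the `qvm-features` implementation: the `Qube` class and the qube names in the usage example are hypothetical, and the model assumes that an explicitly set value (including the empty string) always wins over the value inherited from the network provider.

```python
from typing import Optional


class Qube:
    """Toy model of a qube with an optional `ipv6` feature value."""

    def __init__(self, name: str, netvm: "Optional[Qube]" = None,
                 ipv6: Optional[str] = None):
        self.name = name
        self.netvm = netvm
        self.ipv6 = ipv6  # None means "not set on this qube"


def ipv6_enabled(qube: Qube) -> bool:
    """Walk up the network tree until a qube sets the feature explicitly."""
    node: Optional[Qube] = qube
    while node is not None:
        if node.ipv6 is not None:
            return bool(node.ipv6)  # '1' enables, '' disables
        node = node.netvm
    return False  # nothing set anywhere: IPv6 stays off (opt-in)
```

For instance, with `sys-net` set to `1`, a qube connected through `sys-firewall` inherits IPv6, while a qube with the feature set to `''` does not, even though it uses the same network provider.
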
In that case, the system uplink connection has native IPv6, but this is not always so. When it does not, a tunneling solution can be used (for example, Teredo). The same applies when the user is connected to a VPN service providing IPv6 support, regardless of the user's internet connection.
Such a configuration can be expressed by enabling the `ipv6` feature only on some subset of Qubes networking, for example by creating a separate qube to encapsulate IPv6 traffic and setting `ipv6` to `1` only there. See the diagram below:
![ipv6-2](/attachment/wiki/IPv6/ipv6-2.png)
Besides enabling IPv6 forwarding, the standard Qubes firewall can be used to limit what network resources are available to each qube. Currently, only the `qvm-firewall` command supports adding IPv6 rules; the GUI firewall editor will have this ability later.
### Limitations ###
Currently, only IPv4 DNS servers are configured, regardless of the `ipv6` feature state. It is done this way to avoid reconfiguring all connected qubes whenever IPv6 DNS becomes available or unavailable. Configuring qubes to always use IPv6 DNS and only fall back to IPv4 may result in relatively long timeouts and poor usability.
Note, however, that resolving DNS over IPv4 does not prevent IPv6 addresses from being returned. In practice, this is only a problem for IPv6-only networks.


@ -0,0 +1,81 @@
---
layout: doc
title: Security-critical Code
permalink: /doc/security-critical-code/
redirect_from:
- /en/doc/security-critical-code/
- /doc/SecurityCriticalCode/
- /wiki/SecurityCriticalCode/
- /trac/wiki/SecurityCriticalCode/
---
Security-critical Code in Qubes OS
==================================
Below is a list of security-critical (i.e., trusted) code components in Qubes OS.
A successful attack against any of these components could compromise the system's security.
This code can be thought of as the Trusted Computing Base (TCB) of Qubes OS.
One of the main goals of the project is to keep the TCB to an absolute minimum.
The size of the current TCB is on the order of hundreds of thousands of lines of C code, which is several orders of magnitude less than other OSes.
(In Windows, Linux, and Mac OSes, the amount of trusted code is typically on the order of tens of *millions* of lines of C code.)
For more information, see [Qubes Security Goals].
Security-critical Qubes-specific Components
-------------------------------------------
The following code components are security-critical in Qubes OS:
- Dom0-side of the libvchan library
- Dom0-side of the GUI virtualization code (`qubes-guid`)
- Dom0-side of the sound virtualization code (`pacat-simple-vchan`)
- Dom0-side of the qrexec-related code (`qrexec_daemon`)
- VM memory manager (`qmemman`) that runs in Dom0
- Select Qubes RPC servers that run in Dom0: `qubes.ReceiveUpdates` and `qubes.SyncAppMenus`
- The `qubes.Filecopy` RPC server that runs in a VM (critical because it could allow one VM to compromise another if the user allows a file copy operation to be performed between them)
Security-critical Third-party Components
----------------------------------------
We did not create these components, but Qubes OS relies on them.
At the current stage of the project, we cannot afford to spend the time to thoroughly review and audit them, so we more or less "blindly" trust that they are secure.
- The Xen hypervisor
- Xen's xenstore backend running in Dom0
- Xen's block backend running in Dom0's kernel
- The RPM program used in Dom0 for verifying signatures on dom0 updates
- Somewhat trusted: log viewing software in dom0 that parses VM-influenced logs
Attacks on Networking Components
--------------------------------
Here are two examples of networking components that an adversary might seek to attack (or in which to exploit a vulnerability as part of an attack):
- Xen network PV frontends
- VMs' core networking stacks (core TCP/IP code)
Hypothetically, an adversary could compromise a NetVM, `sys-net-1`, and try to use it to attack the VMs connected to that NetVM.
However, Qubes allows for the existence of more than one NetVM, so the adversary would not be able to use `sys-net-1` in order to attack VMs connected to a *different* NetVM, `sys-net-2` without also compromising `sys-net-2`.
In addition, the adversary would not be able to use `sys-net-1` (or, for that matter, `sys-net-2`) to attack VMs that have networking disabled (i.e., VMs that are not connected to any NetVM).
Buggy Code vs. Backdoored Code
------------------------------
There is an important distinction between buggy code and maliciously backdoored code.
We could have the most secure architecture and the most bulletproof TCB that perfectly isolates all domains from each other, but it would all be pretty useless if all the code we ran inside our domains (e.g. the code in our email clients, word processors, and web browsers) were backdoored.
In that case, only network-isolated domains would be somewhat trustworthy.
This means that we must trust at least some of the vendors that supply the code we run inside our domains.
(We don't have to trust *all* of them, but we at least have to trust the few that provide the apps we use in the most critical domains.)
In practice, we trust the software provided by the [Fedora Project].
This software is signed by Fedora distribution keys, so it is also critical that the tools used in domains for software updates (`dnf` and `rpm`) are trustworthy.
[Qubes Security Goals]: /security/goals/
[Fedora Project]: https://getfedora.org/
[Understanding and Preventing Data Leaks]: /doc/data-leaks/


@ -0,0 +1,53 @@
---
layout: doc
title: Storage Pools
permalink: /doc/storage-pools/
---
Storage Pools in Qubes
======================
Qubes OS R3.2 introduced the concept of storage drivers and pools. This feature
was a first step towards a saner storage API, which is heavily rewritten in R4.
See [here](https://dev.qubes-os.org/projects/core-admin/en/latest/qubes-storage.html)
for documentation on storage pools in R4.
A storage driver provides a way to store VM images in a Qubes OS system.
Currently, the default driver is `xen`, which stores
volume images as files in a directory tree like `/var/lib/qubes/`.
A storage pool driver can be identified either by the driver name with the
`driver` key or by the class name like this:
`class=qubes.storage.xen.XenStorage`. Because R3.2 doesn't use Python
`setup_hooks`, to actually use a short driver name for a custom storage driver,
you have to patch `qubes-core-admin`. You can use the `class` config key
instead, when your class is accessible by `import` in Python.
A pool (in R3.2) is configuration information which can be referenced when
creating a new VM. Each pool is saved in `storage.conf`. It has a name, a
storage driver and some driver specific configuration attached.
When installed, the system has, as you can see from the contents of
`/etc/qubes/storage.conf`, a pool named `default`. It uses the driver `xen`. The
default pool is special in R3.2. It will add `dir_path=/var/lib/qubes`
configuration value from `defaults[pool_config]`, if not overwritten.
Currently the only supported driver out of the box is `xen`. The benefit of
pools (besides that you can write your own storage driver e.g. for Btrfs) in R3.2
is that you can store your domains in multiple places.
You can add a pool to `storage.conf` like this:
```
[foo]
driver=xen
dir_path=/opt/qubes-vm
```
Now, when creating a new VM on the command-line, you may pass the `-Pfoo`
argument to `qvm-create` to have the VM images stored in pool `foo`. See also
`qvm-create --help`.
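
For illustration, a pool configuration of this shape can be read with Python's standard `configparser`. This sketch is not the `qubes-core-admin` loader: the contents of the `[default]` section and the function name are assumptions, and only the `dir_path` fallback described above is modelled.

```python
import configparser

# Example configuration; the [default] section's contents are assumed.
EXAMPLE_CONF = """\
[default]
driver=xen

[foo]
driver=xen
dir_path=/opt/qubes-vm
"""


def load_pools(text: str) -> dict[str, dict[str, str]]:
    """Parse pool sections into a {pool name: settings} mapping."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    pools = {}
    for name in parser.sections():
        settings = dict(parser[name])
        if name == "default":
            # The default pool falls back to /var/lib/qubes if not overridden.
            settings.setdefault("dir_path", "/var/lib/qubes")
        pools[name] = settings
    return pools
```

Note that `configparser` treats section names case-sensitively, so a section called `default` is an ordinary section and does not collide with the parser's special `DEFAULT` section.
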
While the current API is not as clean and beautiful as the R4 API, it allows
you to write your own storage drivers e.g. for Btrfs today.


@ -0,0 +1,12 @@
---
layout: doc
title: System Documentation
permalink: /doc/system-doc/
redirect_from:
- /en/doc/system-doc/
- /doc/SystemDoc/
- /wiki/SystemDoc/
redirect_to:
- /doc/#developer-documentation
---


@ -0,0 +1,113 @@
---
layout: doc
title: Template Implementation
permalink: /doc/template-implementation/
redirect_from:
- /en/doc/template-implementation/
- /doc/TemplateImplementation/
- /wiki/TemplateImplementation/
---
Overview of VM block devices
============================
Every VM has 4 block devices connected:
- **xvda** - base root device (/); details are described below
- **xvdb** - private.img; the place where the VM can always write
- **xvdc** - volatile.img, discarded at each VM restart; holds swap and temporary "/" modifications (see below)
- **xvdd** - modules.img; kernel modules and firmware
private.img (xvdb)
------------------
This is mounted as /rw and holds all of the VM's private data. This includes:
- */home*, which is bind mounted to /rw/home
- */usr/local*, which is a symlink to /rw/usrlocal
- some config files (/rw/config) called by Qubes core scripts (e.g., /rw/config/rc.local)
**Note:** Whenever a TemplateBasedVM is created, the contents of the `/home` directory of its parent TemplateVM are copied to the child TemplateBasedVM's `/home`. From that point onward, the child TemplateBasedVM's `/home` is independent from its parent TemplateVM's `/home`, which means that any subsequent changes to the parent TemplateVM's `/home` will no longer affect the child TemplateBasedVM's `/home`. Once a TemplateBasedVM has been created, any changes in its `/home`, `/usr/local`, or `/rw/config` directories will be persistent across reboots, which means that any files stored there will still be available after restarting the TemplateBasedVM. No changes in any other directories in TemplateBasedVMs persist in this manner. If you would like to make changes in other directories which *do* persist in this manner, you must make those changes in the parent TemplateVM.
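
The persistence rule in the note above can be expressed as a small check. This is a sketch for reasoning about paths only; it does not resolve symlinks or bind mounts, and the function name is illustrative.

```python
from pathlib import PurePosixPath

# Directories that the note above lists as persistent in a TemplateBasedVM.
PERSISTENT_DIRS = ("/home", "/usr/local", "/rw/config")


def persists_across_reboot(path: str) -> bool:
    """True if `path` is one of the persistent directories or lies inside one."""
    candidate = PurePosixPath(path)
    for directory in PERSISTENT_DIRS:
        root = PurePosixPath(directory)
        if candidate == root or root in candidate.parents:
            return True
    return False
```

For example, `/home/user/notes.txt` and `/rw/config/rc.local` persist, while a change to `/etc/hosts` is lost on restart and must be made in the parent TemplateVM instead.
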
modules.img (xvdd)
------------------

As the kernel is chosen in dom0, there must be some way to provide matching kernel modules to the VM. A Qubes kernel directory consists of three files:

- *vmlinuz*: the actual kernel
- *initramfs*: initial ramdisk containing a script to set up snapshot devices (see below) and mount /lib/modules
- *modules.img*: filesystem image of /lib/modules with matching kernel modules and firmware (/lib/firmware/updates is symlinked to /lib/modules/firmware)

Normally, a kernel "package" is shared by many VMs (the kernel used by a VM can be set using qvm-prefs). One kernel can be set as the default (qvm-set-default-kernel) to simplify kernel updates (by default, all VMs use the default kernel). All installed kernels are placed in /var/lib/qubes/vm-kernels as separate subdirectories. In this case, modules.img is attached to the VM as a read-only device.

There is a special case in which a VM can have a custom kernel: when it is updateable (StandaloneVM or TemplateVM) and the kernel is set to "none" (using qvm-prefs). In this case, the VM uses the kernel from the "kernels" subdirectory of the VM's own directory, and modules.img is attached as a read-write device. FIXME: "none" should be renamed to "custom".
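
The selection rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `modules_attachment`, the `ModulesAttachment` record, and the decision to raise an error for a non-updateable VM are all hypothetical and are not taken from the Qubes code. Only the `/var/lib/qubes/vm-kernels` location, the "kernels" subdirectory, and the "none" value come from the text.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

# Location of installed kernels, as stated in the text above.
KERNELS_DIR = PurePosixPath("/var/lib/qubes/vm-kernels")


@dataclass
class ModulesAttachment:
    """Hypothetical record: which modules.img is attached, and in which mode."""
    image: PurePosixPath
    read_only: bool


def modules_attachment(kernel_pref, vm_dir, updateable):
    """Pick the modules.img for a VM from its kernel preference.

    kernel_pref: name of a subdirectory of KERNELS_DIR, or "none" for a custom kernel.
    vm_dir:      the VM's own directory (passed in; its location is not assumed here).
    updateable:  True for a StandaloneVM or TemplateVM.
    """
    if kernel_pref == "none":
        if not updateable:
            # Assumption: the custom-kernel case applies only to updateable VMs.
            raise ValueError("a custom kernel requires an updateable VM")
        image = PurePosixPath(vm_dir) / "kernels" / "modules.img"
        return ModulesAttachment(image, read_only=False)
    # Shared kernel "package": attached read-only.
    return ModulesAttachment(KERNELS_DIR / kernel_pref / "modules.img", read_only=True)
```

For example, calling `modules_attachment("none", some_vm_dir, True)` yields a read-write attachment inside the VM's own "kernels" subdirectory, while any other kernel name yields a read-only attachment under /var/lib/qubes/vm-kernels.
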
Qubes TemplateVM implementation
===============================
TemplateVM has a shared root.img across all AppVMs that are based on it. This mechanism has some advantages over a simple common device connected to multiple VMs:
- root.img can be modified while there are AppVMs running without corrupting the filesystem
- multiple AppVMs that are using different versions of root.img (from various points in time) can be running concurrently

There are two layers of the device-mapper snapshot device: the first one enables modifying root.img without stopping the AppVMs, and the second one, which is contained in the AppVM, enables temporary modifications to its filesystem. These modifications are discarded after a restart of the AppVM.

![TemplateSharing2.png](/attachment/wiki/TemplateImplementation/TemplateSharing2.png)
Snapshot device in Dom0
-----------------------

This device consists of:

- root.img: the real template filesystem
- root-cow.img: the differences between the device as seen by the AppVM and the current root.img

The above is achieved by creating device-mapper snapshots for each version of root.img. When an AppVM is started, a Xen hotplug script (/etc/xen/scripts/block-snapshot) reads the inode numbers of root.img and root-cow.img; these numbers are used as the snapshot device's name. When a device with the same name already exists, the new AppVM uses it; therefore, AppVMs based on the same version of root.img use the same device. Of course, the device-mapper cannot use the files directly; they must be connected through /dev/loop\*. The same mechanism detects whether a loop device is already associated with a file (identified by its device and inode numbers) or whether a new loop device must be created.

When an AppVM is stopped, the Xen hotplug script checks whether the device is still in use; if it is not, the script removes the snapshot and frees the loop device.
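
The naming and sharing scheme described above can be illustrated with a small Python sketch. It is not the hotplug script itself: the name format, the `snapshot_name` function, and the `SnapshotRegistry` class are hypothetical, and the registry is only a stand-in for the set of active device-mapper devices. The device number is included next to the inode number as an assumption, so that files on different filesystems cannot collide.

```python
import os


def snapshot_name(root_img, cow_img):
    """Derive a device name from the identity of both files, not from their paths."""
    root = os.stat(root_img)
    cow = os.stat(cow_img)
    # Hypothetical format; the text only says that the inode numbers form the name.
    return f"snapshot-{root.st_dev:x}:{root.st_ino}-{cow.st_dev:x}:{cow.st_ino}"


class SnapshotRegistry:
    """Toy stand-in for the active snapshot devices, with a user count per device."""

    def __init__(self):
        self.users = {}

    def attach(self, root_img, cow_img):
        """Called when an AppVM starts: reuse an existing device or create one."""
        name = snapshot_name(root_img, cow_img)
        self.users[name] = self.users.get(name, 0) + 1
        return name

    def detach(self, name):
        """Called when an AppVM stops: drop the device once nobody uses it."""
        self.users[name] -= 1
        if self.users[name] == 0:
            # Here the real script would remove the snapshot and free the loop device.
            del self.users[name]
```

Two AppVMs started against the same pair of files get the same name and therefore share one device. If root-cow.img is later renamed and a fresh file is created in its place, the new file has a different inode number, so the next AppVM gets a different device.
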
### Changes to template filesystem
In order for the full potential of the snapshot device to be realized, every change in root.img must save the original version of the modified block in root-cow.img. This is achieved by a snapshot-origin device.

When the TemplateVM is started, it receives the snapshot-origin device connected as its root device (in read-write mode). Therefore, every change to this device is immediately saved in root.img but remains invisible to the AppVMs, which use the snapshot.

When the TemplateVM is stopped, the Xen script moves root-cow.img to root-cow.img.old and creates a new one (using the `qvm-template-commit` tool). Existing snapshot devices remain untouched because their loop devices refer to the actual file on disk (by inode, not by name). The Linux kernel frees the old root-cow.img files as soon as they are no longer used by any snapshot device (to be exact, by any loop device). The new root-cow.img file gets a new inode number, so new AppVMs get new snapshot devices (with different names).

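
The copy-on-write behavior described in this section can be modeled in a few lines of Python. This is a toy model, assuming a device is just a list of blocks; the class and method names are hypothetical and it does not reflect the on-disk format used by the device-mapper.

```python
class Origin:
    """Toy model of a snapshot-origin device and the snapshot that depends on it."""

    def __init__(self, blocks):
        self.blocks = list(blocks)  # stands in for root.img
        self.cow = {}               # stands in for root-cow.img: index -> original block

    def write(self, index, data):
        """TemplateVM write: save the original block once, then update root.img."""
        self.cow.setdefault(index, self.blocks[index])
        self.blocks[index] = data

    def snapshot_read(self, index):
        """AppVM read: prefer the saved original, otherwise fall through to root.img."""
        return self.cow.get(index, self.blocks[index])
```

After `write(0, ...)` is called, reading block 0 from `blocks` returns the new data, while `snapshot_read(0)` still returns the block as it was when the snapshot was taken. Only the first overwrite of a block stores anything in `cow`.
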
### Rollback template changes

It is possible to roll back the most recent template changes. The saved root-cow.img.old contains all changes made during the last TemplateVM run. Rolling back is done by reverting this "binary patch".

This is done using the snapshot-merge device-mapper target (available from the 2.6.34 kernel). It requires that no other snapshot device uses the underlying block devices (root.img and root-cow.img, via loop devices). Because of this, all AppVMs based on this template must be halted during this operation.

Steps performed by **qvm-revert-template-changes**:

1. Ensure that no other VM uses this template.
2. Prepare the snapshot device with ***root-cow.img.old*** instead of *root-cow.img* (*/etc/xen/scripts/block-snapshot prepare*).
3. Replace the *snapshot* device-mapper target with *snapshot-merge*; other parameters (chunk size, etc.) remain untouched. The kernel now starts merging the changes stored in *root-cow.img.old* into *root.img*. The d-m device can be used normally during the merge (if needed).
4. Wait for the merge to complete: *dmsetup status* shows the number of used snapshot blocks, which equals the metadata size when the merge is complete.
5. Replace the *snapshot-merge* d-m target with *snapshot* again.
6. Clean up the snapshot device (if nobody is using it at the moment).
7. Move *root-cow.img.old* to *root-cow.img* (overwriting the existing file).

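
Steps 3 and 4 above can be sketched as two small text-processing helpers in Python. They only manipulate strings and do not call dmsetup. The field layouts assumed in the comments follow the kernel's device-mapper snapshot documentation as far as I know it; the helper names and all sample values are hypothetical.

```python
def to_merge_table(table_line):
    """Step 3: swap the target type in a snapshot table line, keeping other fields."""
    fields = table_line.split()
    # Assumed layout: <start> <length> snapshot <origin> <cow> <P|N> <chunk size> ...
    if len(fields) < 7 or fields[2] != "snapshot":
        raise ValueError(f"not a snapshot table line: {table_line!r}")
    fields[2] = "snapshot-merge"
    return " ".join(fields)


def merge_finished(status_line):
    """Step 4: the merge is done when only metadata sectors remain allocated."""
    fields = status_line.split()
    # Assumed layout: <start> <length> snapshot-merge <allocated>/<total> <metadata>
    if len(fields) != 5 or fields[2] != "snapshot-merge" or "/" not in fields[3]:
        raise ValueError(f"unexpected status line: {status_line!r}")
    allocated, _total = fields[3].split("/")
    return int(allocated) == int(fields[4])
```

A caller would feed `to_merge_table` the current table of the device, load the result, and then poll the status until `merge_finished` returns true before switching the target back to *snapshot*.
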
Snapshot device in AppVM
------------------------

The root device is exposed to the AppVM in read-only mode. The AppVM can write only to:

- private.img: persistent storage (mounted at /rw) used for /home and /usr/local; in future versions, its use may be extended
- volatile.img: temporary storage, which is discarded after an AppVM restart

volatile.img is divided into two partitions:

1. changes to the root device
2. swap partition

Inside an AppVM, the root device is wrapped by a snapshot whose changes are stored in the first partition of volatile.img. Therefore, the AppVM can write anything to its filesystem; however, such changes are discarded after a restart.

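
The effect of this second snapshot layer can be modeled with a short Python sketch. It is a toy model, assuming block-level reads and writes on an in-memory device; the class and method names are hypothetical.

```python
class VolatileRoot:
    """Toy model of an AppVM root: a read-only base plus discardable changes."""

    def __init__(self, root_blocks):
        self.root = tuple(root_blocks)  # read-only root device, never modified
        self.changes = {}               # stands in for the first partition of volatile.img

    def read(self, index):
        """Return the AppVM's own version of a block if it wrote one, else the base."""
        return self.changes.get(index, self.root[index])

    def write(self, index, data):
        """Writes never touch the base; they land in the volatile layer."""
        if not 0 <= index < len(self.root):
            raise IndexError(index)
        self.changes[index] = data

    def restart(self):
        """A restart discards volatile.img, so every change is lost."""
        self.changes.clear()
```

Writing a block and reading it back returns the new data, but after `restart()` the same read returns the original template content again.
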
StandaloneVM
------------

A StandaloneVM enables the user to modify the root filesystem persistently. It can be created by passing the *--standalone* switch to *qvm-create*.

It is implemented just like a TemplateVM (it has its own root.img connected as a read-write device), but no other VMs can be based on it.