With lots of VMs updating, the firewall quit with:
2017-04-23 20:47:52 -00:00: INF [frameQ] Queue length for 10.137.3.11: incr to 474
2017-04-23 20:47:52 -00:00: INF [memory_pressure] Writing meminfo: free 2648 / 17504 kB (15.13 %)
[...]
Fatal error: out of memory.
The firewall will now drop frames when more than 10 are queued (note
that queuing only starts once the network driver's transmit buffer is
already full).
Fedora 24 doesn't work with opam (because the current binary release of
aspcud's clasp binary segfaults, which opam reports as `External solver
failed with inconsistent return value.`).
In particular, this:
- Adds support for ICMP queries and errors.
- Uses an LRU cache to avoid running out of memory and needing to reset
the table.
- Passes around parsed packets rather than raw ethernet frames.
Qubes does not remove the client directory itself when the domain exits.
Combined with 63cbb4bed0, this prevented clients from reconnecting.
This may also make it possible to connect clients to the firewall via
multiple interfaces, although this doesn't seem useful.
We previously assumed that Qubes would always give clients IP addresses
on a particular network. However, it is not required to do this and in
fact uses a different network for disposable VMs.
With this change:
- We no longer reject clients with unknown IP addresses
- The `Unknown_client` classification is gone; we have no way to tell
the difference between a client that isn't connected and an external
address.
- We now consider every client to be on a point-to-point link and do not
answer ARP requests on behalf of other clients. Clients should assume
their netmask is 255.255.255.255 (and ignore /qubes-netmask).
This is a partial fix for #9. It allows disposable VMs to connect to the
firewall but for some reason they don't process any frames we send them
(we get their ARP requests but they don't get our replies). Taking eth0
down in the disp VM, then bringing it back up (and re-adding the routes)
allows it to work.
- Unpin bootvar and use register ~argv:no_argv` instead.
- Use new name for uplink device ("0", not "tap0").
- Don't configure logging - mirage does that for us now.
The ocamlfind package has started listing this as a required dependency
for some reason, although it appears not to need it.
Fixes#4, reported by cyrinux.
The callback function was partially applied, meaning that it always used
the NAT table that was in use when processing started, even if the OOM
handler had replaced the table by then. This meant that the retry
attempt would always fail, since it tried to add it to the existing full
table, and also prevented that table from being GC'd.
Before, when resetting the NAT table to handle an out-of-memory
condition we tried to allocate the new table while still holding
the reference to the old one. It should be more reliable to drop
the old reference first.
Log showed:
2016-01-31 19:33.47: INF [firewall] added NAT redirect 10.137.3.12:32860 -> 53:firewall:52517 -> 53:net-vm
2016-01-31 19:33.52: WRN [firewall] Out_of_memory adding NAT rule. Dropping NAT table...
--- End dump ---
Fatal error: exception Out of memory
Raised by primitive operation at file "hashtbl.ml", line 63, characters 52-70
Called from file "router.ml", line 47, characters 11-30
Called from file "src/core/lwt.ml", line 907, characters 20-24
Mirage exiting with status 2
Do_exit called!