70 commits

Author SHA1 Message Date
Daniel Micay
029ec73c3c networkd: set PreferredLifetime=0 for anycast IPs
This avoids these being used for outbound connections.
2025-11-21 11:31:48 -05:00
Daniel Micay
a0ba527f9d remove gra1.grapheneos.org and las0.grapheneos.org 2025-11-21 11:31:48 -05:00
Daniel Micay
1fad7ca6cd add fra.grapheneos.org and hio.grapheneos.org servers
These were previously 2 of our 4 OVH ns1.grapheneos.org instances. Our
ns1.grapheneos.network network has been entirely moved to Vultr for BGP
support so we're reusing these 2 instances as replacements for 2 of the
existing grapheneos.org servers.
2025-11-21 11:31:48 -05:00
Daniel Micay
209b1b5def add lon.ns1.grapheneos.org 2025-11-21 11:31:48 -05:00
Daniel Micay
d2dcec7e02 ns2: add IPv4 address from our anycast /24 2025-11-21 11:31:48 -05:00
Daniel Micay
0dfb05852f networkd: add comments for anycast addresses 2025-11-21 11:31:48 -05:00
Daniel Micay
bb86e16179 networkd: remove unnecessary [Address] sections 2025-11-21 11:31:48 -05:00
Daniel Micay
5adb170069 add mia.ns2.grapheneos.org server 2025-11-21 11:31:48 -05:00
Daniel Micay
649e2b53c4 replace remaining OVH ns1 servers with Vultr 2025-11-21 11:31:48 -05:00
Daniel Micay
066fdd0d09 add IPv6 address from our /48 announced from BuyVM 2025-11-21 11:31:48 -05:00
Daniel Micay
fe999c541a add IPv6 address from our /48 announced from Vultr 2025-11-21 11:31:48 -05:00
Daniel Micay
5256f2e4a4 replace 1.ns1.grapheneos.org server with sea.ns1.grapheneos.org 2025-11-21 11:31:48 -05:00
Daniel Micay
f95fa51821 add lax.ns1.grapheneos.org server 2025-11-21 11:31:48 -05:00
Daniel Micay
951662aeca replace 0.ns1.grapheneos.org server with nyc.ns1.grapheneos.org 2025-11-21 11:31:48 -05:00
Daniel Micay
4aba8d355a add mia.ns1.grapheneos.org server 2025-11-21 11:31:48 -05:00
Daniel Micay
ebd44c9253 grapheneos.org: switch to location-based server names 2025-11-21 11:31:48 -05:00
Daniel Micay
e3bcb9e87f ns2.grapheneos.org: switch to location-based server names 2025-11-21 11:31:48 -05:00
Daniel Micay
93e1d3866b releases.grapheneos.org: switch to location-based server names 2025-11-21 11:31:48 -05:00
Daniel Micay
8af52e3498 journald: revert back to default SystemMaxFiles
This was raised to 10000 to work around 2 separate journald bugs causing
premature rotation which have been resolved for a long time.
2025-11-04 13:45:16 -05:00
Daniel Micay
7f0982f9d7 journald: disable ForwardToWall 2025-11-04 11:51:00 -05:00
Daniel Micay
f1ff8ac931 phase out 2.releases.grapheneos.org 2025-11-04 11:19:13 -05:00
Daniel Micay
8697cf2a2d switch back to unified journald rotation/retention
Since we're no longer storing nginx logs in journald, we no longer need
to use journald configuration to control nginx log rotation/retention.

We switched from nginx to dnsdist for the authoritative DNS servers and
are therefore no longer logging any of the queries persistently since we
can rely on the PowerDNS and dnsdist in-memory buffers and stats.

We can use nginx-specific logrotate configuration on a per-server basis
based on balancing the usefulness of access logs with storage space and
getting rid of slightly sensitive data faster (mainly IP addresses).
2025-11-03 20:03:59 -05:00
Daniel Micay
2caa67529a set up syslog-ng for nginx access log
This sets up the infrastructure for moving from storing nginx access
logs in journald to plain text files written by syslog-ng and rotated by
logrotate. This works around the poor performance, poor space efficiency
and lack of archived log compression for journald. Unlike writing access
logs directly with nginx, this continues avoiding blocking writes in the
event loop and sticks to asynchronous sends through a socket.

Since nginx only supports syslog via the RFC 3164 protocol rather than
the more modern RFC 5424 protocol, this leaves formatting timestamps up
to nginx rather than using the ones provided via the syslog protocol.
2025-11-03 00:33:28 -05:00
Daniel Micay
a346146625 reorder update servers 2025-11-01 20:04:51 -04:00
Daniel Micay
01305667bd remove legacy 2.releases.grapheneos.org IPv6 address 2025-10-31 00:38:22 -04:00
Daniel Micay
7fa179260f phase in new IPv6 address for 2.releases.grapheneos.org 2025-10-30 20:11:17 -04:00
Daniel Micay
0d1705320f use consistent naming for session ticket key scripts/units 2025-10-30 17:06:07 -04:00
Daniel Micay
9fde84c877 add initial session ticket key synchronization 2025-10-30 14:22:55 -04:00
Daniel Micay
f9430a1aeb add script for deploying certbot replication setup 2025-10-30 14:22:32 -04:00
Daniel Micay
8340cf2813 add workaround for system encrypted swap race
This appeared to be solved a while ago but ended up returning.
2025-10-29 22:36:11 -04:00
Daniel Micay
0b519d6f5e set AccuracySec=1us for tcp-fastopen-rotate-keys 2025-10-28 12:33:10 -04:00
Daniel Micay
9ed61cef61 reduce TLS session ticket key interval from 8h to 6h 2025-10-27 22:50:32 -04:00
Daniel Micay
ce0942702e add RemainAfterExit=yes to create-session-ticket-keys.service 2025-10-27 22:11:22 -04:00
Daniel Micay
448565de54 update description for rotate-session-ticket-keys.timer 2025-10-27 21:19:32 -04:00
Daniel Micay
c4af821eda always create /var/cache/nginx for web servers
This avoids needing to restart nginx for ReadWritePaths to kick in after
creating it.
2025-10-27 20:52:34 -04:00
Daniel Micay
f8a1d381e7 mdmonitor.service: use syslog reporting 2025-10-19 16:16:33 -04:00
Daniel Micay
f2a4df1d0f add another IPv6 address for 0.releases.grapheneos.org
This will be used to send more traffic to it via DNS RRset load
balancing.
2025-10-11 15:31:09 -04:00
Daniel Micay
5ea8e202a1 0.releases.grapheneos.org IPv4 update
The main IPv4 address has changed and we're now using an additional IPv4
address to send more traffic to it via DNS RRset load balancing.
2025-10-11 15:30:35 -04:00
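The effect of publishing extra addresses for one server can be estimated with simple arithmetic. The sketch below is illustrative only: it assumes clients pick uniformly among the addresses in the RRset, and the server names and record counts are hypothetical rather than taken from the real zone.

```python
from fractions import Fraction


def expected_share(records_per_server: dict[str, int]) -> dict[str, Fraction]:
    """Fraction of new connections each server should see, assuming every
    address in the RRset is equally likely to be chosen by a client."""
    total = sum(records_per_server.values())
    return {name: Fraction(count, total) for name, count in records_per_server.items()}


if __name__ == "__main__":
    # Hypothetical record counts: one server publishes two addresses,
    # the others publish one each.
    counts = {"0.releases": 2, "1.releases": 1, "2.releases": 1}
    for name, share in expected_share(counts).items():
        print(f"{name}: {share} of new connections")
```

Under that assumption, a server with two of the four published addresses receives about half of new connections instead of a third.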
Daniel Micay
02b7e4e5c1 add 3.releases.grapheneos.org server 2025-10-09 09:06:31 -04:00
Daniel Micay
48d939d39d adjust IPv6 subnet size for ReliableSite servers 2025-10-05 00:50:18 -04:00
Daniel Micay
348cdf9d74 update systemd configuration 2025-09-18 11:17:05 -04:00
Daniel Micay
c6156ebed7 switch from shaped CAKE to FQ for BuyVM servers
These servers originally only had the 1Gbps base bandwidth and shaping
it with CAKE worked well to make the most of it during traffic spikes
for the web servers. It has little value for the nameservers since the
only potentially high throughput service is non-interactive SSH.

These servers now have 10Gbps burst available but are heavily limited by
their single virtual core and unable to use all of it in practice. CAKE
can only provide significant value when it's the bottleneck which isn't
the case when the workload is CPU limited. We don't want to keep around
the artificially low 1Gbps limit and it can't do much more.

Unlike OVH, the practical bottleneck is the CPU and FQ has the lowest
CPU usage in practice due to being very performance-oriented with a FIFO
fast path and offloading TCP pacing from the TCP stack to itself. On the
DNS servers, the fast path is always used in practice. Our OVH servers
have a much lower enforced bandwidth limit and the way they implement it
ruins fairness across flows. We definitely want to stick with CAKE for
our VPS instances on OVH but it doesn't make sense on BuyVM anymore.
2025-09-18 01:26:39 -04:00
Daniel Micay
d923bc7e24 use monotonic timer for session ticket key rotation
It makes more sense to rotate session ticket keys every 8 hours instead
of doing it at 3 specific times each day where the initial rotation will
happen earlier than necessary. It makes little difference due to keeping
the previous 3 session ticket keys valid but is cleaner.
2025-09-15 21:10:42 -04:00
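The difference between the two scheduling styles can be shown with a small calculation. This is a sketch under assumptions: the calendar schedule is taken to be 00:00, 08:00 and 16:00 (the commit only says "3 specific times each day"), and the function names are hypothetical.

```python
from datetime import datetime, timedelta

INTERVAL = timedelta(hours=8)


def first_rotation_calendar(created: datetime) -> datetime:
    """Next 00:00/08:00/16:00 boundary strictly after the key was created
    (assumed schedule, not confirmed by the commit)."""
    midnight = created.replace(hour=0, minute=0, second=0, microsecond=0)
    elapsed = created - midnight
    return midnight + (elapsed // INTERVAL + 1) * INTERVAL


def first_rotation_monotonic(created: datetime) -> datetime:
    """Rotation a full interval after the key was created."""
    return created + INTERVAL


if __name__ == "__main__":
    created = datetime(2025, 9, 15, 13, 30)
    print("calendar: ", first_rotation_calendar(created) - created)
    print("monotonic:", first_rotation_monotonic(created) - created)
```

With keys created at 13:30, the assumed calendar schedule rotates them after 2.5 hours, while the monotonic schedule waits the full 8 hours.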
Daniel Micay
35ca9a2a19 allow server TCP Fast Open and rotate the keys
This needs to be configured by specific services to have any effect. For
now, we're only enabling it for the PowerDNS Authoritative Server and
dnsdist since it's recommended by RFC 9210 and actively used by various
recursive resolver servers when falling back to TCP. TCP Fast Open is
rarely used from end user devices due to it enabling tracking and having
issues with middleboxes. We aren't going to start using it anywhere in
GrapheneOS but may have more server-side uses for it. This functionality
is built into QUIC without the same downsides but QUIC support in the
software we use is not ready for us to enable it, especially the very
primitive support in nginx.

For most servers, a new random TCP Fast Open key is created on a daily
basis and the previous key continues to be accepted. For DNS servers,
the new key is generated via a keyed hash of the current date in order
to keep it consistent across servers providing an anycast IP without it
needing regular synchronization.
2025-09-15 21:10:39 -04:00
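A date-derived key of the kind described for the DNS servers could look roughly like the following. This is a minimal sketch, not the repository's actual script: the choice of HMAC-SHA256, the ISO date string as input, and the shared secret are all assumptions. The output follows the Linux net.ipv4.tcp_fastopen_key format of four 8-digit hex words per key, with an optional backup key after a comma.

```python
import hashlib
import hmac
from datetime import date, timedelta


def tfo_key_for(secret: bytes, day: date) -> str:
    """Derive a 16-byte key from the date and format it as four hex words.
    HMAC-SHA256 over the ISO date is an assumed construction."""
    digest = hmac.new(secret, day.isoformat().encode(), hashlib.sha256).digest()
    hex_key = digest[:16].hex()
    return "-".join(hex_key[i:i + 8] for i in range(0, 32, 8))


def tfo_sysctl_value(secret: bytes, today: date) -> str:
    """Current key first, previous day's key second so it is still accepted."""
    yesterday = today - timedelta(days=1)
    return f"{tfo_key_for(secret, today)},{tfo_key_for(secret, yesterday)}"


if __name__ == "__main__":
    # Hypothetical secret; every node behind the anycast IP would need the same one.
    shared_secret = b"hypothetical secret deployed to every anycast node"
    print(tfo_sysctl_value(shared_secret, date(2025, 9, 15)))
```

Because each node computes the same value from the same secret and date, nodes sharing an anycast IP end up with matching keys without exchanging anything at rotation time.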
Daniel Micay
46fe2fd36c add CAP_CHOWN to certbot-renew.service for dnsdist 2025-09-05 02:06:01 -04:00
Daniel Micay
ca22d4a0a3 enable adaptive-rx on ReliableSite update servers
This is fully supported by the Broadcom NIC used for both servers but
not enabled by default. It's already enabled by default for the Intel
NIC used by the Macarne update server.
2025-09-04 16:48:17 -04:00
Daniel Micay
ece7064674 raise NIC channels to number of threads
1.releases.grapheneos.org and 2.releases.grapheneos.org were ending up
with only 6 channels by default despite the hardware being capable of
far more. This raises it to match the 24 CPU threads.

0.releases.grapheneos.org is already using 32 channels by default which
matches the 32 CPU threads.
2025-09-04 01:00:22 -04:00
Daniel Micay
e9fda8e7a1 map packet priority 4 to the high priority fq band 2025-09-01 19:35:49 -04:00
Daniel Micay
adf8269ac2 switch CAKE to diffserv4 now that DSCP marks are correct 2025-09-01 19:35:49 -04:00
Daniel Micay
f3ae87143f set handle for CAKE 2025-08-28 20:06:46 -04:00