This sets up the infrastructure for moving from storing nginx access
logs in journald to plain text files written by syslog-ng and rotated by
logrotate. This works around the poor performance, poor space efficiency
and lack of archived log compression for journald. Unlike writing access
logs directly with nginx, this continues avoiding blocking writes in the
event loop and sticks to asynchronous sends through a socket.
Since nginx only supports syslog via the RFC 3164 protocol rather than
the more modern RFC 5424 protocol, this leaves formatting timestamps up
to nginx rather than using the ones provided via the syslog protocol.
These servers originally only had the 1Gbps base bandwidth and shaping
it with CAKE worked well to make the most of it during traffic spikes
for the web servers. It has little value for the nameservers since the
only potentially high throughput service is non-interactive SSH.
These servers now have 10Gbps burst available but are heavily limited by
their single virtual core and unable to use all of it in practice. CAKE
can only provide significant value when it's the bottleneck which isn't
the case when the workload is CPU limited. We don't want to keep around
the artificially low 1Gbps limit and it can't do much more.
Unlike OVH, the practical bottleneck is the CPU and FQ has the lowest
CPU usage in practice due to being very performance-oriented with a FIFO
fast path and offloading TCP pacing from the TCP stack to itself. On the
DNS servers, the fast path is always used in practice. Our OVH servers
have a much lower enforced bandwidth limit and the way they implement it
ruins fairness across flows. We definitely want to stick with CAKE for
our VPS instances on OVH but it doesn't make sense on BuyVM anymore.
It makes more sense to rotate session ticket keys every 8 hours instead
of doing it at 3 specific times each day where the initial rotation will
happen earlier than necessary. It makes little difference due to keeping
the previous 3 session tickets valid but is cleaner.
This needs to be configured by specific services to have any effect. For
now, we're only enabling it for the PowerDNS Authoritative Server and
dnsdist since it's recommended by RFC 9210 and actively used by various
recursive resolver servers when falling back to TCP. TCP Fast Open is
rarely used from end user devices due to it enabling tracking and having
issues with middleboxes. We aren't going to start using it anywhere in
GrapheneOS but may have more server-side uses for it. This functionality
is built into QUIC without the same downsides but QUIC support in the
software we use is not ready for us to enable it, especially the very
primitive support in nginx.
For most servers, a new random TCP Fast Open key is created on a daily
basis and the previous key continues to be accepted. For DNS servers,
the new key is generated via a keyed hash of the current date in order
to keep it consistent across servers providing an anycast IP without it
needing regular synchronization.
1.releases.grapheneos.org and 2.releases.grapheneos.org were ending up
with only 6 channels by default despite the hardware being capable of
far more. This raises it to match the 24 CPU threads.
0.releases.grapheneos.org is already using 32 channels by default which
matches the 32 CPU threads.
We can no longer use OCSP stapling and Must-Staple. These will soon be
obsolete once the `shortlived` profile is available for public use since
it will provide certificates with a similar lifetime as OCSP responses.
In the meantime, we've moved to the `tlsserver` profile stripping legacy
features to prepare for the `shortlived` profile which will be identical
to `tlsserver` but with a validity period of 6 days.
The certificate for SUPL is still temporarily using the classic profile
to work around the older generations of end-of-life Snapdragon Pixels
not having support for SNI. We can eventually drop support for these
devices from the SUPL service to allow us to disable TLSv1.1, DHE and
move to the `tlsserver` or `shortlived` profile.
The certificate for SMTP is still temporarily using the classic profile
to avoid potential compatibility issues with servers supporting TLSv1.2
but still not yet supporting SNI.