Unbound now requests 4M for the send buffer by default and we might as
well permit that for both the send and receive buffers. We set the max
auto-tuned send buffer size on a per-server basis but don't currently
have much use for varying the maximum manually specified buffer size
across servers. That setting can be moved to per-server configuration in
the future if needed.
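As a rough illustration of why the caps matter, here's a minimal Python
sketch (not part of our configuration) showing how an unprivileged
service's buffer requests get clamped to the net.core.wmem_max and
net.core.rmem_max sysctls:

    import socket

    # A service requesting 4 MiB buffers, the way Unbound's so-sndbuf does.
    BUF = 4 * 1024 * 1024

    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Without CAP_NET_ADMIN, these requests are silently clamped to
    # net.core.wmem_max / net.core.rmem_max.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, BUF)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, BUF)
    # The kernel doubles the requested value to account for bookkeeping
    # overhead, so anything below 2 * BUF here means the cap was hit.
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))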
TCP Fast Open needs to be enabled by specific services to have any
effect. For now, we're only enabling it for the PowerDNS Authoritative
Server and dnsdist since it's recommended by RFC 9210 and actively used
by various recursive resolvers when falling back to TCP. TCP Fast Open
is rarely used from end user devices because it enables tracking and has
issues with middleboxes. We aren't going to start using it anywhere in
GrapheneOS but may have more server-side uses for it. This functionality
is built into QUIC without the same downsides, but QUIC support in the
software we use is not ready for us to enable it, especially the very
primitive support in nginx.
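For reference, the per-service opt-in amounts to a single socket option
set before listening. A minimal Python sketch, assuming a Linux host
where the net.ipv4.tcp_fastopen sysctl has the server bit (0x2) set; the
port and queue length below are placeholders:

    import socket

    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("0.0.0.0", 8053))  # placeholder port
    # Maximum queue of pending TFO requests; has no effect unless the
    # net.ipv4.tcp_fastopen sysctl permits server-side TFO.
    srv.setsockopt(socket.IPPROTO_TCP, socket.TCP_FASTOPEN, 16)
    srv.listen(128)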
For most servers, a new random TCP Fast Open key is created daily and
the previous key continues to be accepted. For DNS servers, the new key
is generated via a keyed hash of the current date to keep it consistent
across the servers providing an anycast IP without needing regular
synchronization.
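A sketch of what that derivation could look like in Python; the secret,
hash choice, and date encoding are assumptions rather than what we
actually deploy:

    import datetime
    import hashlib
    import hmac

    def tfo_key(secret: bytes, day: datetime.date) -> str:
        # Keyed hash of the date, truncated to the 128-bit key size and
        # formatted as the four 32-bit words tcp_fastopen_key expects.
        digest = hmac.new(secret, day.isoformat().encode(), hashlib.sha256).digest()
        return "-".join(digest[i:i + 4].hex() for i in range(0, 16, 4))

    secret = b"placeholder shared secret"  # distributed to each anycast server
    today = datetime.date.today()
    # Current key first, previous day's key still accepted as the backup.
    keys = tfo_key(secret, today) + "," + tfo_key(secret, today - datetime.timedelta(days=1))
    # e.g. written to /proc/sys/net/ipv4/tcp_fastopen_key
    print(keys)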
BBRv1 provides much better throughput in many cases and is particularly
useful for our update servers. The fairness issues based on round trip
time are not a major problem for us. The fairness issues when competing
with traditional loss-based congestion control are relevant to us, but
they seem to benefit us more than they hurt us. BBRv3 will fix most of
this while preserving nearly all the benefits and will likely be shipped
as a replacement for BBRv1 in the Linux kernel rather than as another
option.
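Congestion control can be switched system-wide via the
net.ipv4.tcp_congestion_control sysctl or opted into per-socket. A
minimal per-socket sketch in Python, assuming the tcp_bbr module is
available:

    import socket

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Opt this socket into BBR instead of the system default.
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, b"bbr")
    print(s.getsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, 16))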
The reason we rolled it back last time was that we saw cases of the
initial bandwidth estimate being overly low, combined with a very bad
interaction with synproxy causing low initial bandwidth. We've partially
addressed the synproxy issue by raising the synproxy threshold based on
conntrack table size, which we're now fully scaling based on available
memory. If we decide this is still a significant issue, we can limit
BBRv1 to our update servers, where it has massive benefits and the least
downside since initial bandwidth isn't as important there. BBRv3 will
help with this by probing round trip time every 5 seconds instead of
every 10 seconds but still has similar issues.
BBRv1 significantly improves throughput in some cases but also
significantly reduces it in others. We've run into too many network
conditions that it handles quite poorly. There's also a bad interaction
between BBR and synproxy where it cripples the initial throughput for
connections established via synproxy. This means a basic SYN flood
attack could severely degrade initial TCP throughput for most
connections.
Android doesn't enable ECN for outbound connections yet and we don't
want to deviate from that, so it mainly gets activated for macOS and iOS
clients. The Linux kernel's approach to ECN hasn't been modernized and
there are fierce debates about how it should work. It can cause issues
and it seems best to avoid it until Android enables it.
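For context, ECN negotiation on Linux is controlled by the
net.ipv4.tcp_ecn sysctl; the snippet below just documents its values:

    # net.ipv4.tcp_ecn values on Linux:
    #   0 - never negotiate ECN
    #   1 - request ECN on outgoing connections and accept incoming requests
    #   2 - accept incoming ECN requests only (the kernel default)
    with open("/proc/sys/net/ipv4/tcp_ecn") as f:
        print("current tcp_ecn:", f.read().strip())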