Since we're no longer storing nginx logs in journald, we no longer need
to use journald configuration to control nginx log rotation/retention.

We switched from nginx to dnsdist for the authoritative DNS servers and
are therefore no longer logging any of the queries persistently since we
can rely on the PowerDNS and dnsdist in-memory buffers and stats.

We can use nginx-specific logrotate configuration on a per-server basis
based on balancing the usefulness of access logs with storage space and
getting rid of slightly sensitive data faster (mainly IP addresses).

The error log is fairly quiet during regular use but can end up logging
one or more lines per request during DDoS attacks. Errors are logged for
worker_connections depletion and limit_conn rejections. There's also
currently an nginx bug with modern TLS and OpenSSL causing some
client-side TLS errors to be logged as crit instead of info.

This provides more redundancy for both services by having two
instances in each region. The network services have much higher
bandwidth usage and load, so this will also delay our need to obtain
new servers by making better use of the ones we have.

This sets up the infrastructure for moving from storing nginx access
logs in journald to plain text files written by syslog-ng and rotated by
logrotate. This works around the poor performance, poor space efficiency
and lack of archived log compression for journald. Unlike writing access
logs directly with nginx, this continues avoiding blocking writes in the
event loop and sticks to asynchronous sends through a socket.

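As a rough illustration of the non-blocking socket approach described
above (not nginx's actual implementation), the Python sketch below sends
a log line over a Unix datagram socket and drops it if the send would
block. The name send_log_line, the socketpair standing in for a real
syslog-ng socket and the drop-on-full policy are all assumptions made
for the example.

```python
import socket


def send_log_line(sock: socket.socket, line: str) -> bool:
    """Try to send one log line without ever blocking the caller.

    Returns True if the kernel accepted the datagram, False if it was
    dropped because the socket buffer was full.
    """
    try:
        sock.send(line.encode("utf-8"))
        return True
    except BlockingIOError:
        # Assumed policy for this sketch: lose the line rather than
        # stall the event loop while the receiver catches up.
        return False


def demo() -> bytes:
    # A socketpair stands in for the Unix socket a syslog daemon would
    # listen on; this is purely for illustration.
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        sender.setblocking(False)
        send_log_line(sender, "GET / 200")
        return receiver.recv(4096)
    finally:
        sender.close()
        receiver.close()
```

The point of the sketch is only that the sender never waits on disk
I/O: the cost of writing and rotating files moves to the receiving
process.
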
Since nginx only supports syslog via the RFC 3164 protocol rather than
the more modern RFC 5424 protocol, this leaves formatting timestamps up
to nginx rather than using the ones provided via the syslog protocol.

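To show why the protocol timestamp is not very useful here, the Python
sketch below formats the same event with an RFC 3164 style header and an
RFC 5424 style header. The function names, the host name, the tag and
the local7/info priority are assumptions chosen for the example and are
not taken from our configuration.

```python
from datetime import datetime, timezone

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

FACILITY_LOCAL7 = 23  # assumed facility, for illustration only
SEVERITY_INFO = 6


def pri(facility: int, severity: int) -> int:
    # Both RFCs encode the priority value as facility * 8 + severity.
    return facility * 8 + severity


def rfc3164_line(ts: datetime, host: str, tag: str, msg: str) -> str:
    # "Mmm dd hh:mm:ss": no year, no time zone, whole seconds only.
    # The day of month is padded with a space rather than a zero.
    stamp = f"{MONTHS[ts.month - 1]} {ts.day:2d} {ts:%H:%M:%S}"
    return f"<{pri(FACILITY_LOCAL7, SEVERITY_INFO)}>{stamp} {host} {tag}: {msg}"


def rfc5424_line(ts: datetime, host: str, app: str, msg: str) -> str:
    # Version 1 header with a full ISO 8601 timestamp. The three "-"
    # fields are the unused PROCID, MSGID and STRUCTURED-DATA values.
    return (f"<{pri(FACILITY_LOCAL7, SEVERITY_INFO)}>1 {ts.isoformat()} "
            f"{host} {app} - - - {msg}")
```

For an event at 2024-03-05 14:07:09.123456 UTC, the RFC 3164 header
carries only "Mar  5 14:07:09", while the RFC 5424 header carries
"2024-03-05T14:07:09.123456+00:00". Putting a precise timestamp into
the message body from the nginx side (for example via a log_format
using $time_iso8601, as an assumed setup) avoids depending on the
coarser header.
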
The previous commit works around a long-standing systemd bug which
recently began impacting us again. If the workaround stops working, the
failure mode should not be stalling boot forever. Swap isn't needed for
our servers to function, so failing to set it up shouldn't break them.

This closes a small window where new workers could give keys not
accepted by the old workers before they're gracefully shut down. This
will also be needed when syncing keys across a cluster.