This allows us to handle /context/ requests on the client_reader worker
without having to pull in all the various stream handlers (e.g.
precence, typing, pushers etc). The only thing the token gets used for
is pagination, and that ignores everything but the room portion of the
token.
First of all, fix the logic which looks for identical input state groups so
that we actually use them. This turned out to be most easily done by factoring
the relevant code out to a separate function so that we could do an early
return.
Secondly, avoid building the whole `conflicted_state` dict (which was only ever
used as a boolean flag).
Thirdly, replace the construction of the `state` dict (which mapped from keys
to events that set them), with an optimistic construction of the resolution
result assuming there will be no conflicts. This should be no slower than
building the old `state` dict, and:
- in the conflicted case, we'll short-cut it, saving part of the work
- in the unconflicted case, it saves rebuilding the resolution from the
`state` dict.
Finally, do a couple of s/values/itervalues/.
We don't want to bother pulling out the current state from the DB since
until we know we have to. Checking the context for state is just an
optimisation.
This is more involved than it might otherwise be, because the current
implementation just drops its logcontexts and runs everything in the sentinel
context.
It turns out that we aren't actually using a bunch of the functionality here
(notably suppress_failures and the fact that Distributor.fire returns a
deferred), so the easiest way to fix this is actually by simplifying a bunch of
code.
This fixes#3518, and ensures that we get useful logs and metrics for lots of
things that happen in the background.
(There are certainly more things that happen in the background; these are just
the common ones I've found running a single-process synapse locally).
This introduces a mechanism for tracking resource usage by background
processes, along with an example of how it will be used.
This will help address #3518, but more importantly will give us better insights
into things which are happening but not being shown up by the request metrics.
We *could* do this with Measure blocks, but:
- I think having them pulled out as a completely separate metric class will
make it easier to distinguish top-level processes from those which are
nested.
- I want to be able to report on in-flight background processes, and I don't
think we want to do this for *all* Measure blocks.
The get_entities_changed function was changed to return all changed
entities since the given stream position, rather than only those changed
from a given list of entities. This resulted in the function incorrectly
returning large numbers of entities that, for example, caused large
increases in database usage.
parse_integer and parse_string can take a request and raise errors
in case we have wrong or missing params.
This PR tries to use them more to deduplicate some code and make it
better readable
The stream cache keeps track of all entities that have changed since
a particular stream position, so get_entities_changed does not need to
return unknown entites when given a larger stream position.
This makes it consistent with the behaviour of has_entity_changed.
This line shows up as about 5% of cpu time on a synchrotron:
not_known_entities = set(entities) - set(self._entity_to_key)
Presumably the problem here is that _entity_to_key can be largeish, and
building a set for its keys every time this function is called is slow.
Here we rewrite the logic to avoid building so many sets.
Previously we queued up the poke correctly when the device was deleted,
but then the actual EDU wouldn't get sent, as the device was no longer known.
Instead, we now send EDUs for deleted devices too if there's a poke for them.
This default config is parsed and used a base before the actual
config is overlaid, so with these values not commented out, the
code to detect when no turn params were set and refuse to generate
credentials was never firing because the dummy default was always set.
We need to do a bit more validation when we get a server name, but don't want
to be re-doing it all over the shop, so factor out a separate
parse_and_validate_server_name, and do the extra validation.
Also, use it to verify the server name in the config file.
Make sure that server_names used in auth headers are sane, and reject them with
a sensible error code, before they disappear off into the depths of the system.
otherwise we explode with:
```
Traceback (most recent call last):
File /usr/lib/python2.7/logging/handlers.py, line 78, in emit
logging.FileHandler.emit(self, record)
File /usr/lib/python2.7/logging/__init__.py, line 950, in emit
StreamHandler.emit(self, record)
File /usr/lib/python2.7/logging/__init__.py, line 887, in emit
self.handleError(record)
File /usr/lib/python2.7/logging/__init__.py, line 810, in handleError
None, sys.stderr)
File /usr/lib/python2.7/traceback.py, line 124, in print_exception
_print(file, 'Traceback (most recent call last):')
File /usr/lib/python2.7/traceback.py, line 13, in _print
file.write(str+terminator)
File /home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/logger/_io.py, line 170, in write
self.log.emit(self.level, format=u{log_io}, log_io=line)
File /home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/logger/_logger.py, line 144, in emit
self.observer(event)
File /home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/logger/_observer.py, line 136, in __call__
errorLogger = self._errorLoggerForObserver(brokenObserver)
File /home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/logger/_observer.py, line 156, in _errorLoggerForObserver
if obs is not observer
File /home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/logger/_observer.py, line 81, in __init__
self.log = Logger(observer=self)
File /home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/logger/_logger.py, line 64, in __init__
namespace = self._namespaceFromCallingContext()
File /home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/logger/_logger.py, line 42, in _namespaceFromCallingContext
return currentframe(2).f_globals[__name__]
File /home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/python/compat.py, line 93, in currentframe
for x in range(n + 1):
RuntimeError: maximum recursion depth exceeded while calling a Python object
Logged from file site.py, line 129
File /usr/lib/python2.7/logging/__init__.py, line 859, in emit
msg = self.format(record)
File /usr/lib/python2.7/logging/__init__.py, line 732, in format
return fmt.format(record)
File /usr/lib/python2.7/logging/__init__.py, line 471, in format
record.message = record.getMessage()
File /usr/lib/python2.7/logging/__init__.py, line 335, in getMessage
msg = msg % self.args
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 4: ordinal not in range(128)
Logged from file site.py, line 129
```
...where the logger apparently recurses whilst trying to log the error, hitting the
maximum recursion depth and killing everything badly.
Most rooms have a trivial history visibility like "shared" or
"world_readable", especially large rooms, so lets not bother getting the
full membership of those rooms in that case.
When _get_state_for_groups is given a wildcard filter, just do a complete
lookup. Hopefully this will give us the best of both worlds by not filling up
the ram if we only need one or two keys, but also making the cache still work
for the federation reader usecase.
When we finish processing a request, log the number of events we fetched from
the database to handle it.
[I'm trying to figure out which requests are responsible for large amounts of
event cache churn. It may turn out to be more helpful to add counts to the
prometheus per-request/block metrics, but that is an extension to this code
anyway.]
when there is no `m.room.power_levels` event in force in the room. (PR #3397)
Discussion around the Matrix Spec change proposal for this change can be
followed at https://github.com/matrix-org/matrix-doc/issues/1304.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJbIop9AAoJEIofk9V1tejV9lsIAJVH0l5dXROmy1KH/zt16AUA
CXa6Vv4Vyo6hKad/fZ81OZVRr5ChK/TvbIJVn/SA/muCfdoIFdxhT8eo/pXzO2UW
zReuLsDhAg+gSvpNus37oWj2FVsAE1HYDZ60lfaapAdZnkFit68d5DQZjO6nZHHA
YUXcU3GUwj0ZYuUzFzYKMLu6uNNasNkN8h6SS2lF7Bm4JaKDW+mFMfCyJwdIVSEh
BGhHoVpXdxFysD9s6Mwxqrz3KKg1Jtp7idDkk0x2S2Eh+gxyiDQQokv0oQ3+0+HG
sgy5Iz2t2CkpS02/j+LOvAZljTmnD0bXu3srGR+25StsoDFP038Am3bfQwtD190=
=9jsT
-----END PGP SIGNATURE-----
Merge tag 'v0.31.2'
SECURITY UPDATE: Prevent unauthorised users from setting state events in a room
when there is no `m.room.power_levels` event in force in the room. (PR #3397)
Discussion around the Matrix Spec change proposal for this change can be
followed at https://github.com/matrix-org/matrix-doc/issues/1304.
This is only used by filter_events_for_client, so we can simplify the whole
thing by just doing one user at a time, and removing a dead storage function to
boot.
=======================================
v0.31.1 fixes a security bug in the ``get_missing_events`` federation API
where event visibility rules were not applied correctly.
We are not aware of it being actively exploited but please upgrade asap.
Bug Fixes:
* Fix event filtering in get_missing_events handler (PR #3371)
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEETQ1YthIGLQRddG54CTxDAAxPS/QFAlsalP8ACgkQCTxDAAxP
S/RADw/+NeDu0LjVpS5Uc4ElHgRBuFSm6l2i4z8rZBBlKSYnuq0Em4WMvLloi/JF
iAvTOYE7OjmF+gNvmsdH1N7hc1lKdQ2gAlpvaQR/5Qz9NtOVmM3WPZxS7n5jZHvD
hVSxeO9+GQOwK7rJorqrrsnWHQt0OkLHV6WThFdrgZb1JjWCUDTvw+Hei2uMX2aq
y2mkMG4TLStHwMvL2qw0h+hFtXywXI796qJR73ZxbEn24YD+kOXeEVkIFi2LT0Pj
cgkg7WWT32JD43/ioumLupZuhCmpRyxn4fi5gIKpXe5kiLsxOdApzNQSwmoJ+WA8
7zlrWY+0QDN4pbA5ESLitWSWAT50Ul//uM4nwmM4xEBPHdljXvKyHPsaSCeKLvT9
RT8pc41TQAqSshlXF8zgIAtStnF3oGel3EBBl1mmM9Un1ULnBFwWDZhlIm+ZZGhJ
MWoAWNG7j8AQuy0BTUAUr76x7t+/cdSqDuyVl1GO1tbDh0DUWoHZGXCUKrAXnn2T
SbiFigwOLvEADbvkW7L9Je9CVOi2V5Pg/32X9O8YMEiSz+j5PQEiGefyVh/I/QvV
Ha/atRpZF2OZ+XUOO5DLZMP/XCXpVgvHuskzfU6LvvVQCXgExuJhRD1PrRGeBaWr
zjJW+rmY+VeredqX7QkTB3XOGLMLGJfx2FeSMb+j0w3a7/iFj+Q=
=J80S
-----END PGP SIGNATURE-----
Merge tag 'v0.31.1'
Changes in synapse v0.31.1 (2018-06-08)
=======================================
v0.31.1 fixes a security bug in the ``get_missing_events`` federation API
where event visibility rules were not applied correctly.
We are not aware of it being actively exploited but please upgrade asap.
Bug Fixes:
* Fix event filtering in get_missing_events handler (PR #3371)
Firstly, don't swallow the reason for the failure
Secondly, don't assume all exceptions are verification failures
Thirdly, log a bit of info about the key being used if debug is enabled
These "temporary fixes" have been here three and a half years, and I can't find
any events in the matrix.org database where the calculated signature differs
from what's in the db. It's time for them to go away.