forked-synapse

mirror of https://mau.dev/maunium/synapse.git synced 2024-10-01 01:36:05 -04:00

Author	SHA1	Message	Date
Richard van der Hoff	5c445114d3	Correctly account for cpu usage by background threads (#4074) Wrap calls to deferToThread() in a thing which uses a child logcontext to attribute CPU usage to the right request. While we're in the area, remove the logcontext_tracer stuff, which is never used, and afaik doesn't work. Fixes #4064	2018-10-23 13:12:32 +01:00
Amber Brown	e1728dfcbe	Make scripts/ and scripts-dev/ pass pyflakes (and the rest of the codebase on py3) (#4068)	2018-10-20 11:16:55 +11:00
Amber Brown	e404ba9aac	Fix manhole on py3 (pt 2) (#4067)	2018-10-19 22:26:00 +11:00
Amber Brown	a36b0ec195	make a bytestring	2018-10-19 09:24:00 +11:00
Erik Johnston	6982320572	Remove unnecessary extra function call layer	2018-10-08 14:06:19 +01:00
Erik Johnston	8a1817f0d2	Use errback pattern and catch async failures	2018-10-08 13:29:47 +01:00
Erik Johnston	f7199e8734	Log looping call exceptions If a looping call function errors, then it kills the loop entirely. Currently it throws away the exception logs, so we should make it actually log them. Fixes #3929	2018-10-05 11:24:12 +01:00
Erik Johnston	4f3e3ac192	Correctly match 'dict.pop' api	2018-10-01 12:25:27 +01:00
Erik Johnston	8ea887856c	Don't update eviction metrics on explicit removal	2018-10-01 12:00:58 +01:00
Richard van der Hoff	9c8cec5dab	Merge remote-tracking branch 'origin/develop' into erikj/destination_retry_cache	2018-09-28 10:51:09 +01:00
Richard van der Hoff	4a15a3e4d5	Include eventid in log lines when processing incoming federation transactions (#3959) when processing incoming transactions, it can be hard to see what's going on, because we process a bunch of stuff in parallel, and because we may end up recursively working our way through a chain of three or four events. This commit creates a way to use logcontexts to add the relevant event ids to the log lines.	2018-09-27 11:25:34 +01:00
Richard van der Hoff	5b4028fa78	Merge branch 'rav/fix_expiring_cache_len' into erikj/destination_retry_cache	2018-09-26 12:55:53 +01:00
Richard van der Hoff	7ee94fc1ba	Log which cache is throwing exceptions	2018-09-26 12:43:08 +01:00
Erik Johnston	3baf6e1667	Fix ExpiringCache.__len__ to be accurate It used to try and produce an estimate, which was sometimes negative. This caused metrics to be sad, so let's always just calculate it from scratch. (This appears to have been a longstanding bug, but one which has been made more of a problem by #3932 and #3933). (This was originally done by Erik as part of #3933. I'm cherry-picking it because really it's a fix in its own right)	2018-09-26 12:32:29 +01:00
Erik Johnston	19dc676d1a	Fix ExpiringCache.__len__ to be accurate It used to try and produce an estimate, which was sometimes negative. This caused metrics to be sad, so let's always just calculate it from scratch.	2018-09-21 16:25:42 +01:00
Erik Johnston	fdd1a62e8d	Add a five minute cache to get_destination_retry_timings Hopefully helps with #3931	2018-09-21 14:56:12 +01:00
Erik Johnston	79eded1ae4	Make ExpiringCache slightly more performant	2018-09-21 14:52:21 +01:00
Erik Johnston	8601c24287	Fix some instances of ExpiringCache not expiring cache items ExpiringCache required that `start()` be called before it would actually start expiring entries. A number of places didn't do that. This PR removes `start` from ExpiringCache, and automatically starts the background reaping process on creation instead.	2018-09-21 14:19:46 +01:00
Richard van der Hoff	642199570c	Improve the logging when handling a federation transaction (#3904) Let's try to rationalise the logging that happens when we are processing an incoming transaction, to make it easier to figure out what is going wrong when they take ages. In particular: - make everything start with a [room_id event_id] prefix - make sure we log a warning when catching exceptions rather than just turning them into other, more cryptic, exceptions.	2018-09-19 17:28:18 +01:00
Erik Johnston	9407bcf37a	Replace custom DeferredTimeoutError with defer.TimeoutError	2018-09-19 11:07:29 +01:00
Erik Johnston	6c48aa0256	Run canceller first to allow it to generate correct error	2018-09-19 11:07:27 +01:00
Erik Johnston	a334e1cace	Update to use new timeout function everywhere. The existing deferred timeout helper function (and the one into twisted) suffer from a bug when a deferred's canceller throws an exception, #3842. The new helper function doesn't suffer from this problem.	2018-09-19 10:39:40 +01:00
Erik Johnston	24efb2a70d	Fix timeout function Turns out deferred.cancel sometimes throws, so we do that last to ensure that we always do resolve the new deferred.	2018-09-15 11:38:39 +01:00
Erik Johnston	fcfe7a850d	Add an awful secondary timeout to fix wedged requests This is an attempt to mitigate #3842 by adding yet-another-timeout	2018-09-14 19:23:07 +01:00
Erik Johnston	0a81038ea0	Add in flight real time metrics for Measure blocks	2018-09-14 15:08:37 +01:00
Erik Johnston	9e05c8d309	Change the manhole SSH key to have more bits Newer versions of openssh client refuse to connect to the old key due to its length.	2018-09-11 10:42:10 +01:00
Richard van der Hoff	be6527325a	Fix exceptions when a connection is closed before we read the headers This fixes bugs introduced in #3700, by making sure that we behave sanely when an incoming connection is closed before the headers are read.	2018-08-20 18:21:10 +01:00
Richard van der Hoff	55e6bdf287	Robustness fix for logcontext filter Make the logcontext filter not explode if it somehow ends up with a logcontext of None, since that infinite-loops the whole logging system.	2018-08-20 18:20:07 +01:00
Amber Brown	324525f40c	Port over enough to get some sytests running on Python 3 (#3668)	2018-08-20 23:54:49 +10:00
Richard van der Hoff	c31793a784	Merge branch 'rav/fix_linearizer_cancellation' into develop	2018-08-10 14:57:27 +01:00
Amber Brown	b37c472419	Rename async to async_helpers because `async` is a keyword on Python 3.7 (#3678)	2018-08-10 23:50:21 +10:00
Richard van der Hoff	638d35ef08	Fix linearizer cancellation on twisted < 18.7 Turns out that cancellation of inlineDeferreds didn't really work properly until Twisted 18.7. This commit refactors Linearizer.queue to avoid inlineCallbacks.	2018-08-10 10:59:09 +01:00
Amber Brown	da7785147d	Python 3: Convert some unicode/bytes uses (#3569)	2018-08-02 00:54:06 +10:00
Richard van der Hoff	a8cbce0ced	fix invalidation	2018-07-27 16:17:17 +01:00
Richard van der Hoff	f102c05856	Rewrite cache list decorator Because it was complicated and annoyed me. I suspect this will be more efficient too.	2018-07-27 13:47:04 +01:00
Richard van der Hoff	03751a6420	Fix some looping_call calls which were broken in #3604 It turns out that looping_call does check the deferred returned by its callback, and (at least in the case of client_ips), we were relying on this, and I broke it in #3604. Update run_as_background_process to return the deferred, and make sure we return it to clock.looping_call.	2018-07-26 11:48:08 +01:00
Richard van der Hoff	3d6df84658	Test and fix support for cancellation in Linearizer	2018-07-20 13:59:55 +01:00
Richard van der Hoff	7c712f95bb	Combine Limiter and Linearizer Linearizer was effectively a Limiter with max_count=1, so rather than maintaining two sets of code, let's combine them.	2018-07-20 13:11:43 +01:00
Richard van der Hoff	8462c26485	Improvements to the Limiter * give them names, to improve logging * use a deque rather than a list for efficiency	2018-07-20 12:50:27 +01:00
Richard van der Hoff	d7275eecf3	Add a sleep to the Limiter to fix stack overflows. Fixes #3570	2018-07-20 12:37:12 +01:00
Amber Brown	95ccb6e2ec	Don't spew errors because we can't save metrics (#3563)	2018-07-19 20:58:18 +10:00
Richard van der Hoff	8c69b735e3	Make Distributor run its processes as a background process This is more involved than it might otherwise be, because the current implementation just drops its logcontexts and runs everything in the sentinel context. It turns out that we aren't actually using a bunch of the functionality here (notably suppress_failures and the fact that Distributor.fire returns a deferred), so the easiest way to fix this is actually by simplifying a bunch of code.	2018-07-18 20:55:05 +01:00
Richard van der Hoff	667fba68f3	Run things as background processes This fixes #3518, and ensures that we get useful logs and metrics for lots of things that happen in the background. (There are certainly more things that happen in the background; these are just the common ones I've found running a single-process synapse locally).	2018-07-18 20:55:05 +01:00
Erik Johnston	b2aa05a8d6	Use efficient .intersection	2018-07-17 11:07:04 +01:00
Erik Johnston	547b1355d3	Fix perf regression in PR #3530 The get_entities_changed function was changed to return all changed entities since the given stream position, rather than only those changed from a given list of entities. This resulted in the function incorrectly returning large numbers of entities that, for example, caused large increases in database usage.	2018-07-17 10:27:51 +01:00
Amber Brown	3fe0938b76	Merge pull request #3530 from matrix-org/erikj/stream_cache Don't return unknown entities in get_entities_changed	2018-07-17 13:44:46 +10:00
Richard van der Hoff	33b40d0a25	Make FederationRateLimiter queue requests properly popitem removes the most recent item by default [1]. We want the oldest. Fixes #3524 [1]: https://docs.python.org/2/library/collections.html#collections.OrderedDict.popitem	2018-07-13 16:19:40 +01:00
Erik Johnston	77b692e65d	Don't return unknown entities in get_entities_changed The stream cache keeps track of all entities that have changed since a particular stream position, so get_entities_changed does not need to return unknown entities when given a larger stream position. This makes it consistent with the behaviour of has_entity_changed.	2018-07-13 15:26:10 +01:00
Richard van der Hoff	fa5c2bc082	Reduce set building in get_entities_changed This line shows up as about 5% of cpu time on a synchrotron: not_known_entities = set(entities) - set(self._entity_to_key) Presumably the problem here is that _entity_to_key can be largeish, and building a set for its keys every time this function is called is slow. Here we rewrite the logic to avoid building so many sets.	2018-07-12 11:37:44 +01:00
Richard van der Hoff	c3c29aa196	Attempt to include db threads in cpu usage stats (#3496) Let's try to include time spent in the DB threads in the per-request/block cpu usage metrics.	2018-07-10 16:12:36 +01:00

487 Commits