Commit Graph

386 Commits

Author SHA1 Message Date
Richard van der Hoff
aab2e4da60
Merge pull request #3140 from matrix-org/rav/use_run_in_background
Use run_in_background in preference to preserve_fn
2018-04-30 00:34:28 +01:00
Richard van der Hoff
fc149b4eeb Merge remote-tracking branch 'origin/develop' into rav/use_run_in_background 2018-04-27 14:31:23 +01:00
Richard van der Hoff
6146332387 Merge remote-tracking branch 'origin/develop' into rav/deferred_timeout 2018-04-27 14:18:00 +01:00
Richard van der Hoff
2a13af23bc Use run_in_background in preference to preserve_fn
While I was going through uses of preserve_fn for other PRs, I converted places
which only use the wrapped function once to use run_in_background, to avoid
creating the function object.
2018-04-27 12:55:51 +01:00
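The difference described in 2a13af23bc can be illustrated with a heavily simplified sketch. It assumes that preserve_fn is a thin wrapper that forwards to run_in_background, and it leaves out all logcontext and deferred handling, so the bodies below are stand-ins rather than the real helpers. The notify function is hypothetical.

```python
def run_in_background(f, *args, **kwargs):
    """Call f straight away and hand back whatever it returns.

    Simplified stand-in: the real helper also deals with logcontexts
    and deferreds, which this sketch ignores.
    """
    return f(*args, **kwargs)


def preserve_fn(f):
    """Return a new function object that forwards to run_in_background."""
    def wrapped(*args, **kwargs):
        return run_in_background(f, *args, **kwargs)
    return wrapped


def notify(user_id):
    # Hypothetical background task, used only for illustration.
    return "notified " + user_id


# Before: build a wrapper function, then call it exactly once.
before = preserve_fn(notify)("@alice:example.com")

# After: the same call, without the intermediate function object.
after = run_in_background(notify, "@alice:example.com")
```

When the wrapped function is only called once, the second form does the same work with one fewer allocation.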
Richard van der Hoff
9d2c1b8429 Backport deferred.addTimeout
Twisted 16.0 doesn't have addTimeout, so let's backport it.
2018-04-27 12:52:30 +01:00
Richard van der Hoff
9255a6cb17 Improve exception handling for background processes
There were a bunch of places where we fire off a process to happen in the
background, but don't have any exception handling on it - instead relying on
the unhandled error being logged when the relevant deferred gets
garbage-collected.

This is unsatisfactory for a number of reasons:
 - logging on garbage collection is best-effort and may happen some time after
   the error, if at all
 - it can be hard to figure out where the error actually happened.
 - it is logged as a scary CRITICAL error which (a) I always forget to grep for
   and (b) it's not really CRITICAL if a background process we don't care about
   fails.

So this is an attempt to add exception handling to everything we fire off into
the background.
2018-04-27 11:07:40 +01:00
Richard van der Hoff
1ea904b9f0 Use deferred.addTimeout instead of time_bound_deferred
This doesn't feel like a wheel we need to reinvent.
2018-04-23 00:53:18 +01:00
Richard van der Hoff
8dc4a6144b
Merge pull request #3107 from NotAFile/py3-bool-nonzero
add __bool__ alias to __nonzero__ methods
2018-04-20 15:43:39 +01:00
Richard van der Hoff
c09a6daf09
Merge pull request #3110 from NotAFile/py3-six-queue
Replace Queue with six.moves.queue
2018-04-20 15:35:00 +01:00
Richard van der Hoff
11a67b7c9d
Merge pull request #3093 from matrix-org/rav/response_cache_wrap
Refactor ResponseCache usage
2018-04-20 11:31:17 +01:00
Adrian Tschira
878995e660 Replace Queue with six.moves.queue
and a six.range change which I missed the last time

Signed-off-by: Adrian Tschira <nota@notafile.com>
2018-04-16 00:46:21 +02:00
Adrian Tschira
f63ff73c7f add __bool__ alias to __nonzero__ methods
Signed-off-by: Adrian Tschira <nota@notafile.com>
2018-04-15 20:40:47 +02:00
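A minimal sketch of the aliasing pattern named in f63ff73c7f. The Batch class is hypothetical; only the last line of the class body is the technique itself.

```python
class Batch(object):
    """Hypothetical container whose truthiness depends on its contents."""

    def __init__(self, items):
        self.items = list(items)

    def __nonzero__(self):
        # Python 2 calls this for truth testing.
        return len(self.items) > 0

    # Python 3 looks for __bool__ instead, so point it at the same method.
    __bool__ = __nonzero__


empty_is_truthy = bool(Batch([]))
full_is_truthy = bool(Batch([1]))
```

Without the alias, a Python 3 interpreter would ignore __nonzero__ and treat every Batch instance as true.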
Richard van der Hoff
d3347ad485 Revert "Use sortedcontainers instead of blist"
This reverts commit 9fbe70a7dc.

It turns out that sortedcontainers.SortedDict is not an exact match for
blist.sorteddict; in particular, `popitem()` removes things from the opposite
end of the dict.

This is trivial to fix, but I want to add some unit tests, and potentially some
more thought about it, before we do so.
2018-04-13 11:16:43 +01:00
Richard van der Hoff
60f6014bb7 ResponseCache: fix handling of completed results
Turns out that ObservableDeferred.observe doesn't return a deferred if the
result is already completed. Fix handling and improve documentation.
2018-04-13 07:32:29 +01:00
Richard van der Hoff
b78395b7fe Refactor ResponseCache usage
Adds a `.wrap` method to ResponseCache which wraps up the boilerplate of a
(get, set) pair, and then use it throughout the codebase.

This will be largely non-functional, but does include the following functional
changes:

* federation_server.on_context_state_request: drops use of _server_linearizer
  which looked redundant and could cause incorrect cache misses by yielding
  between the get and the set.
* RoomListHandler.get_remote_public_room_list(): fixes logcontext leaks
* the wrap function includes some logging. I'm hoping this won't be too noisy
  on production.
2018-04-12 13:02:15 +01:00
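The (get, set) boilerplate that b78395b7fe folds into `.wrap` can be sketched roughly as follows. This is not the Synapse implementation: it swaps Twisted deferreds for asyncio tasks so that it runs on the standard library alone, it drops an entry as soon as the result completes, and the class and function names are hypothetical.

```python
import asyncio


class ResponseCacheSketch(object):
    """Shares one in-flight computation between concurrent callers."""

    def __init__(self):
        self._pending = {}

    def get(self, key):
        return self._pending.get(key)

    def set(self, key, task):
        self._pending[key] = task
        # Forget the entry once the result is available.
        task.add_done_callback(lambda _task: self._pending.pop(key, None))
        return task

    def wrap(self, key, callback, *args, **kwargs):
        # The get and the set happen back to back, with no yield in
        # between, so two callers cannot both miss the cache.
        task = self.get(key)
        if task is None:
            task = self.set(key, asyncio.ensure_future(callback(*args, **kwargs)))
        return task


calls = []


async def slow_lookup(room_id):
    # Hypothetical expensive request handler.
    calls.append(room_id)
    await asyncio.sleep(0)
    return "state for " + room_id


async def demo():
    cache = ResponseCacheSketch()
    first = cache.wrap("room1", slow_lookup, "room1")
    second = cache.wrap("room1", slow_lookup, "room1")
    return await asyncio.gather(first, second)


results = asyncio.run(demo())
```

Both callers receive the same result, while slow_lookup runs only once.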
Richard van der Hoff
d5c74b9f6c
Merge pull request #3092 from matrix-org/rav/response_cache_metrics
Add metrics for ResponseCache
2018-04-12 12:59:36 +01:00
Richard van der Hoff
261124396e
Merge pull request #3059 from matrix-org/rav/doc_response_cache
Document the behaviour of ResponseCache
2018-04-12 11:22:30 +01:00
Richard van der Hoff
b3384232a0 Add metrics for ResponseCache 2018-04-10 23:14:47 +01:00
Vincent Breitmoser
9fbe70a7dc Use sortedcontainers instead of blist
This commit replaces blist with SortedContainers as a drop-in. They are
written in pure Python so work with PyPy, but perform as well as
native implementations, at least in a couple of benchmarks:

http://www.grantjenks.com/docs/sortedcontainers/performance.html
2018-04-10 11:29:51 +02:00
Richard van der Hoff
13decdbf96 Revert "Merge pull request #3066 from matrix-org/rav/remove_redundant_metrics"
We aren't ready to release this yet, so I'm reverting it for now.

This reverts commit d1679a4ed7, reversing
changes made to e089100c62.
2018-04-09 12:59:12 +01:00
Richard van der Hoff
3449da3bc7
Merge pull request #3068 from matrix-org/rav/fix_cache_invalidation
Improve database cache performance
2018-04-05 17:21:44 +01:00
Richard van der Hoff
01afc563c3 Fix overzealous cache invalidation
Fixes an issue where a cache invalidation would invalidate *all* pending
entries, rather than just the entry that we intended to invalidate.
2018-04-05 16:24:04 +01:00
Richard van der Hoff
518f6de088 Remove redundant metrics which were deprecated in 0.27.0. 2018-04-04 19:46:28 +01:00
Richard van der Hoff
a9a74101a4 Document the behaviour of ResponseCache
it looks like everything that uses ResponseCache expects to have to
`make_deferred_yieldable` its results. It's debatable whether that is the best
approach, but let's document it for now to avoid further confusion.
2018-04-04 09:06:22 +01:00
Richard van der Hoff
05630758f2 Use static JSONEncoders
using json.dumps with custom options requires us to create a new JSONEncoder on
each call. It's more efficient to create one upfront and reuse it.
2018-03-29 23:13:33 +01:00
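The idea in 05630758f2 can be shown with the standard library alone. The option values and function names below are assumptions chosen for illustration, not necessarily the ones the commit uses.

```python
import json

# Built once at import time and reused for every call.
_compact_encoder = json.JSONEncoder(sort_keys=True, separators=(",", ":"))


def encode_per_call(obj):
    # With non-default options, json.dumps constructs a fresh
    # JSONEncoder on every call.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"))


def encode_static(obj):
    # Same output, but the encoder object already exists.
    return _compact_encoder.encode(obj)


event = {"type": "m.room.message", "content": {"body": "hi"}}
```

Both functions produce identical output; the static encoder simply skips the per-call construction.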
Matthew Hodgson
8cbbfaefc1 404 correctly on missing paths via NoResource
fixes https://github.com/matrix-org/synapse/issues/2043 and https://github.com/matrix-org/synapse/issues/2029
2018-03-23 10:32:50 +00:00
Erik Johnston
9a0d783c11 Add comments 2018-03-19 11:35:53 +00:00
Erik Johnston
7c7706f42b Fix bug where state cache used lots of memory
The state cache bases its size on the sum of the size of entries. The
size of the entry is calculated once on insertion, so it is important
that the size of entries does not change.

The DictionaryCache modified the entries size, which caused the state
cache to incorrectly think it was smaller than it actually was.
2018-03-15 15:46:54 +00:00
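The failure mode described in 7c7706f42b can be reproduced with a small sketch. SizedCache is a hypothetical stand-in for a size-tracking cache, with eviction left out; it only shows how the tracked size drifts once a stored entry is modified after insertion.

```python
class SizedCache(object):
    """Tracks its total size as the sum of entry sizes taken at insertion."""

    def __init__(self):
        self._entries = {}
        self._sizes = {}
        self.tracked_size = 0

    def set(self, key, value):
        self.pop(key)
        size = len(value)  # measured once, at insertion time
        self._entries[key] = value
        self._sizes[key] = size
        self.tracked_size += size

    def pop(self, key):
        if key in self._entries:
            self.tracked_size -= self._sizes.pop(key)
            return self._entries.pop(key)
        return None

    def actual_size(self):
        return sum(len(value) for value in self._entries.values())


cache = SizedCache()
state = {"m.room.name": "a"}
cache.set("group1", state)

# Growing the stored entry in place: the cache is never told about it.
state["m.room.topic"] = "b"
state["m.room.member"] = "c"
```

After the in-place updates the cache still believes it holds one item while it actually holds three, which is how a size-bounded cache can end up using far more memory than its limit suggests.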
Richard van der Hoff
20f40348d4 Factor run_in_background out from preserve_fn
It annoys me that we create temporary function objects when there's really no
need for it. Let's factor the gubbins out of preserve_fn and start using it.
2018-03-08 11:50:11 +00:00
Richard van der Hoff
3a75de923b Rewrite make_deferred_yieldable avoiding inlineCallbacks
... because (a) it's actually simpler (b) it might be marginally more
performant?
2018-03-01 12:40:05 +00:00
Richard van der Hoff
bc496df192 report metrics on number of cache evictions 2018-02-05 15:34:01 +00:00
Matthew Hodgson
ab9f844aaf
Add federation_domain_whitelist option (#2820)
Add federation_domain_whitelist

Gives a way to restrict which domains your HS is allowed to federate with.
Useful mainly for gracefully preventing a private but internet-connected HS from trying to federate with the wider public Matrix network.
2018-01-22 19:11:18 +01:00
Matthew Hodgson
d84f65255e
Merge pull request #2813 from matrix-org/matthew/registrations_require_3pid
add registrations_require_3pid and allow_local_3pids
2018-01-22 13:57:22 +00:00
Matthew Hodgson
8fe253f19b fix PR nitpicking 2018-01-19 18:23:45 +00:00
Matthew Hodgson
447f4f0d5f rewrite based on PR feedback:
* [ ] split config options into allowed_local_3pids and registrations_require_3pid
* [ ] simplify and comment logic for picking registration flows
* [ ] fix docstring and move check_3pid_allowed into a new util module
* [ ] use check_3pid_allowed everywhere

@erikjohnston PTAL
2018-01-19 15:33:55 +00:00
Erik Johnston
b6dc7044a9
Merge pull request #2804 from matrix-org/erikj/file_consumer
Add decent impl of a FileConsumer
2018-01-18 16:31:33 +00:00
Richard van der Hoff
d57765fc8a Fix bugs in block metrics
... which I introduced in #2785
2018-01-18 12:24:42 +00:00
Erik Johnston
be0dfcd4a2 Do logcontexts correctly 2018-01-18 11:57:57 +00:00
Erik Johnston
1432f7ccd5 Move test stuff to tests 2018-01-18 11:57:57 +00:00
Erik Johnston
2f18a2647b Make all fields private 2018-01-18 11:57:54 +00:00
Erik Johnston
dc519602ac Ensure registerProducer isn't called twice 2018-01-18 11:07:17 +00:00
Erik Johnston
17b54389fe Fix _notify_empty typo 2018-01-18 11:05:34 +00:00
Erik Johnston
28b338ed9b Move definition of paused_producer to __init__ 2018-01-18 11:04:41 +00:00
Erik Johnston
a177325b49 Fix comments 2018-01-18 11:02:43 +00:00
Erik Johnston
bc67e7d260 Add decent impl of a FileConsumer
Twisted core doesn't have a general purpose one, so we need to write one
ourselves.

Features:
- All writing happens in background thread
- Supports both push and pull producers
- Push producers get paused if the consumer falls behind
2018-01-17 16:43:03 +00:00
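The first feature listed in bc67e7d260, doing all writes on a background thread, can be sketched with the standard library. This is not a Twisted consumer: the class name is hypothetical, and a bounded queue that blocks the caller stands in for pausing a push producer, which a reactor-based implementation would do without blocking.

```python
import io
import queue
import threading


class BackgroundFileWriter(object):
    """Writes chunks to a file object from a dedicated thread."""

    _DONE = object()

    def __init__(self, file_obj, max_pending=4):
        self._file = file_obj
        # Bounded queue: write() blocks when the writer thread falls
        # behind. This is a crude stand-in for pausing a producer.
        self._queue = queue.Queue(maxsize=max_pending)
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def write(self, data):
        self._queue.put(data)

    def close(self):
        # Signal the end of input and wait for pending writes to land.
        self._queue.put(self._DONE)
        self._thread.join()

    def _run(self):
        while True:
            chunk = self._queue.get()
            if chunk is self._DONE:
                return
            self._file.write(chunk)


buf = io.BytesIO()
writer = BackgroundFileWriter(buf)
for chunk in (b"hello ", b"federation ", b"media"):
    writer.write(chunk)
writer.close()
written = buf.getvalue()
```

Chunks are written in the order they were queued, and close() returns only after the last one has been written.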
Richard van der Hoff
3d12d97415 Track DB scheduling delay per-request
For each request, track the amount of time spent waiting for a db
connection. This entails adding it to the LoggingContext and we may as well add
metrics for it while we are passing.
2018-01-16 17:23:32 +00:00
Richard van der Hoff
6324b65f08 Track db txn time in millisecs
... to reduce the amount of floating-point foo we do.
2018-01-16 15:53:18 +00:00
Richard van der Hoff
44a498418c Optimise LoggingContext creation and copying
It turns out that the only thing we use the __dict__ of LoggingContext for is
`request`, and given we create lots of LoggingContexts and then copy them every
time we do a db transaction or log line, using the __dict__ seems a bit
redundant. Let's try to optimise things by making the request attribute
explicit.
2018-01-16 15:49:42 +00:00
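One way to make an attribute explicit, as 44a498418c describes, is sketched below. The use of `__slots__` is an assumption of this sketch rather than something the commit message states, and both classes are hypothetical simplifications.

```python
class LoggingContextSketch(object):
    # Declaring __slots__ means instances carry no per-instance
    # __dict__ (an assumption of this sketch, not taken from the commit).
    __slots__ = ["name", "request"]

    def __init__(self, name=None):
        self.name = name
        self.request = None

    def copy_to(self, record):
        # One explicit attribute copy instead of iterating over __dict__.
        record.request = self.request


class Record(object):
    """Hypothetical stand-in for a log record."""


ctx = LoggingContextSketch("GET-1")
ctx.request = "GET-1"
record = Record()
ctx.copy_to(record)
has_dict = hasattr(ctx, "__dict__")
```

Each context is then cheaper to create, and copying it onto a record touches a single known attribute.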
Richard van der Hoff
39f4e29d01 Reorganise request and block metrics
To stop the number of duplicate foo:count metrics from growing without
bound, it's time for a rearrangement.

The following are all deprecated, and replaced with synapse_util_metrics_block_count:
  synapse_util_metrics_block_timer:count
  synapse_util_metrics_block_ru_utime:count
  synapse_util_metrics_block_ru_stime:count
  synapse_util_metrics_block_db_txn_count:count
  synapse_util_metrics_block_db_txn_duration:count

The following are all deprecated, and replaced with synapse_http_server_response_count:
  synapse_http_server_requests
  synapse_http_server_response_time:count
  synapse_http_server_response_ru_utime:count
  synapse_http_server_response_ru_stime:count
  synapse_http_server_response_db_txn_count:count
  synapse_http_server_response_db_txn_duration:count

The following are renamed (the old metrics are kept for now, but deprecated):

  synapse_util_metrics_block_timer:total ->
     synapse_util_metrics_block_time_seconds

  synapse_util_metrics_block_ru_utime:total ->
     synapse_util_metrics_block_ru_utime_seconds

  synapse_util_metrics_block_ru_stime:total ->
     synapse_util_metrics_block_ru_stime_seconds

  synapse_util_metrics_block_db_txn_count:total ->
     synapse_util_metrics_block_db_txn_count

  synapse_util_metrics_block_db_txn_duration:total ->
     synapse_util_metrics_block_db_txn_duration_seconds

  synapse_http_server_response_time:total ->
     synapse_http_server_response_time_seconds

  synapse_http_server_response_ru_utime:total ->
     synapse_http_server_response_ru_utime_seconds

  synapse_http_server_response_ru_stime:total ->
     synapse_http_server_response_ru_stime_seconds

  synapse_http_server_response_db_txn_count:total ->
     synapse_http_server_response_db_txn_count

  synapse_http_server_response_db_txn_duration:total ->
     synapse_http_server_response_db_txn_duration_seconds
2018-01-15 17:09:44 +00:00
Richard van der Hoff
b2cd6accf5 Remove __PreservingContextDeferred too 2017-11-14 23:00:10 +00:00