Commit Graph

10043 Commits

Author SHA1 Message Date
Richard van der Hoff
a7e4ff9cca
Merge pull request #2795 from matrix-org/rav/track_db_scheduling
Track DB scheduling delay per-request
2018-01-17 14:29:37 +00:00
Richard van der Hoff
f884cfffb9
Merge pull request #2797 from matrix-org/rav/user_id_checking
Sanity checking for user ids
2018-01-17 14:29:12 +00:00
Richard van der Hoff
a5213df1f7 Sanity checking for user ids
Check the user_id passed to a couple of APIs for validity, to avoid
"IndexError: list index out of range" exception which looks scary and results
in a 500 rather than a more useful error.

Fixes #1432, among other things
2018-01-17 14:28:54 +00:00
Richard van der Hoff
e8f7541d3f Merge remote-tracking branch 'origin/develop' into rav/track_db_scheduling 2018-01-17 14:01:57 +00:00
Richard van der Hoff
fb6563b4be
Merge pull request #2793 from matrix-org/rav/db_txn_time_in_millis
Track db txn time in millisecs
2018-01-17 13:52:42 +00:00
Richard van der Hoff
1954e867b4
Merge pull request #2796 from matrix-org/rav/fix_closed_connection_errors
Fix 'NoneType' object has no attribute 'writeHeaders'
2018-01-17 13:52:24 +00:00
Richard van der Hoff
f23b4078c0
Merge pull request #2794 from matrix-org/rav/rework_run_interaction
rework runInteraction in terms of runConnection
2018-01-17 13:50:42 +00:00
Richard van der Hoff
11ab2f56f5 Merge branch 'develop' into rav/fix_closed_connection_errors 2018-01-17 11:25:11 +00:00
Richard van der Hoff
0486a7814a Merge branch 'develop' into rav/rework_run_interaction 2018-01-17 11:24:38 +00:00
Richard van der Hoff
90c14da992 Merge branch 'develop' into rav/db_txn_time_in_millis 2018-01-17 11:19:55 +00:00
Richard van der Hoff
1067b96364
Merge pull request #2792 from matrix-org/rav/optimise_logging_context
Optimise LoggingContext creation and copying
2018-01-17 11:19:28 +00:00
Richard van der Hoff
38506773eb Merge remote-tracking branch 'origin/develop' into rav/optimise_logging_context 2018-01-17 10:57:13 +00:00
Erik Johnston
300edc2348 Update last access time when thumbnails are viewed 2018-01-17 10:24:43 +00:00
Erik Johnston
05f98a2224 Keep track of last access time for local media 2018-01-17 10:24:43 +00:00
Erik Johnston
3cb2dabaad
Merge pull request #2789 from matrix-org/erikj/fix_thumbnails
Fix thumbnailing remote files
2018-01-17 10:22:20 +00:00
Erik Johnston
d728c47142 Add docstring 2018-01-17 10:06:14 +00:00
Richard van der Hoff
936482d507 Fix 'NoneType' object has no attribute 'writeHeaders'
Avoid throwing a (harmless) exception when we try to write an error response to
an http request where the client has disconnected.

This comes up as a CRITICAL error in the logs which tends to mislead people
into thinking there's an actual problem
2018-01-16 17:58:16 +00:00
Richard van der Hoff
3d12d97415 Track DB scheduling delay per-request
For each request, track the amount of time spent waiting for a db
connection. This entails adding it to the LoggingContext and we may as well add
metrics for it while we are passing.
2018-01-16 17:23:32 +00:00
Richard van der Hoff
0f5d2cc37c Merge branch 'rav/db_txn_time_in_millis' into rav/track_db_scheduling 2018-01-16 17:09:15 +00:00
Richard van der Hoff
8615f19d20 rework runInteraction in terms of runConnection
... so that we can share the code
2018-01-16 17:08:29 +00:00
Matthew Hodgson
5e97ca7ee6 fix typo 2018-01-16 16:52:35 +00:00
Erik Johnston
d863f68cab Use local vars 2018-01-16 16:24:15 +00:00
Erik Johnston
6368e5c0ab Change _generate_thumbnails to take media_type 2018-01-16 16:17:38 +00:00
Erik Johnston
0a90d9ede4 Move setting of file_id up to caller 2018-01-16 16:03:05 +00:00
Richard van der Hoff
6324b65f08 Track db txn time in millisecs
... to reduce the amount of floating-point foo we do.
2018-01-16 15:53:18 +00:00
Richard van der Hoff
44a498418c Optimise LoggingContext creation and copying
It turns out that the only thing we use the __dict__ of LoggingContext for is
`request`, and given we create lots of LoggingContexts and then copy them every
time we do a db transaction or log line, using the __dict__ seems a bit
redundant. Let's try to optimise things by making the request attribute
explicit.
2018-01-16 15:49:42 +00:00
Erik Johnston
5dfc83704b Fix typo 2018-01-16 14:32:56 +00:00
Richard van der Hoff
febdca4b37
Merge pull request #2785 from matrix-org/rav/reorganise_metrics_again
Reorganise request and block metrics
2018-01-16 14:10:21 +00:00
Richard van der Hoff
f5f89fda21
Merge pull request #2790 from matrix-org/rav/preserve_event_logcontext_leak
Fix a logcontext leak in persist_events
2018-01-16 14:09:13 +00:00
Erik Johnston
307f88dfb6 Fix up log lines 2018-01-16 13:53:52 +00:00
Richard van der Hoff
807e848f0f
Merge pull request #2787 from matrix-org/rav/worker_event_counts
Metrics for events processed in appservice and fed sender
2018-01-16 13:14:09 +00:00
Richard van der Hoff
4a31a61ef9
Merge pull request #2786 from matrix-org/rav/rdata_metrics
Metrics for number of RDATA commands received
2018-01-16 13:13:53 +00:00
Richard van der Hoff
ee7a1cabd8 document metrics changes 2018-01-16 13:04:01 +00:00
Erik Johnston
9795b9ebb1 Correctly use server_name/file_id when generating/fetching remote thumbnails 2018-01-16 12:02:06 +00:00
Erik Johnston
c5b589f2e8 Log when we respond with 404 2018-01-16 12:01:40 +00:00
Richard van der Hoff
64ddec1bc0 Fix a logcontext leak in persist_events
ObserveableDeferred expects its callbacks to be called without any
logcontexts, whereas it turns out we were calling them with the logcontext of
the request which initiated the persistence loop.

It seems wrong that we are attributing work done in the persistence loop to the
request that happened to initiate it, so let's solve this by dropping the
logcontext for it.

(I'm not sure this actually causes any real problems other than messages in the
debug log, but let's clean it up anyway)
2018-01-16 11:47:36 +00:00
Erik Johnston
a4c5e4a645 Fix thumbnailing remote files 2018-01-16 11:37:50 +00:00
Erik Johnston
1159abbdd2
Merge pull request #2767 from matrix-org/erikj/media_storage_refactor
Refactor MediaRepository to separate out storage
2018-01-16 10:23:50 +00:00
Richard van der Hoff
a027c2af8d Metrics for events processed in appservice and fed sender
More metrics I wished I'd had
2018-01-15 18:23:24 +00:00
Richard van der Hoff
5c3c32f16f Metrics for number of RDATA commands received
I found myself wishing we had this.
2018-01-15 17:45:55 +00:00
Richard van der Hoff
39f4e29d01 Reorganise request and block metrics
In order to circumvent the number of duplicate foo:count metrics increasing
without bounds, it's time for a rearrangement.

The following are all deprecated, and replaced with synapse_util_metrics_block_count:
  synapse_util_metrics_block_timer:count
  synapse_util_metrics_block_ru_utime:count
  synapse_util_metrics_block_ru_stime:count
  synapse_util_metrics_block_db_txn_count:count
  synapse_util_metrics_block_db_txn_duration:count

The following are all deprecated, and replaced with synapse_http_server_response_count:
   synapse_http_server_requests
   synapse_http_server_response_time:count
   synapse_http_server_response_ru_utime:count
   synapse_http_server_response_ru_stime:count
   synapse_http_server_response_db_txn_count:count
   synapse_http_server_response_db_txn_duration:count

The following are renamed (the old metrics are kept for now, but deprecated):

  synapse_util_metrics_block_timer:total ->
     synapse_util_metrics_block_time_seconds

  synapse_util_metrics_block_ru_utime:total ->
     synapse_util_metrics_block_ru_utime_seconds

  synapse_util_metrics_block_ru_stime:total ->
     synapse_util_metrics_block_ru_stime_seconds

  synapse_util_metrics_block_db_txn_count:total ->
     synapse_util_metrics_block_db_txn_count

  synapse_util_metrics_block_db_txn_duration:total ->
     synapse_util_metrics_block_db_txn_duration_seconds

  synapse_http_server_response_time:total ->
     synapse_http_server_response_time_seconds

  synapse_http_server_response_ru_utime:total ->
     synapse_http_server_response_ru_utime_seconds

  synapse_http_server_response_ru_stime:total ->
     synapse_http_server_response_ru_stime_seconds

   synapse_http_server_response_db_txn_count:total ->
      synapse_http_server_response_db_txn_count

   synapse_http_server_response_db_txn_duration:total
      synapse_http_server_response_db_txn_duration_seconds
2018-01-15 17:09:44 +00:00
Richard van der Hoff
992018d1c0 mechanism to render metrics with alternative names 2018-01-15 17:04:39 +00:00
Richard van der Hoff
80fa610f9c Add some comments to metrics classes 2018-01-15 16:52:52 +00:00
Richard van der Hoff
5e16c1dc8c
Merge pull request #2778 from matrix-org/rav/counters_should_be_floats
Make Counter render floats
2018-01-15 14:09:53 +00:00
Richard van der Hoff
19d274085f Make Counter render floats
Prometheus handles all metrics as floats, and sometimes we store non-integer
values in them (notably, durations in seconds), so let's render them as floats
too.

(Note that the standard client libraries also treat Counters as floats.)
2018-01-12 23:49:44 +00:00
Richard van der Hoff
0fc2362d37
Merge pull request #2777 from matrix-org/rav/fix_remote_thumbnails
Reinstate media download on thumbnail request
2018-01-12 19:12:31 +00:00
Richard van der Hoff
21bf87a146 Reinstate media download on thumbnail request
We need to actually download the remote media when we get a request for a
thumbnail.
2018-01-12 15:38:06 +00:00
Erik Johnston
694f1c1b18 Fix up comments 2018-01-12 15:02:46 +00:00
Erik Johnston
e21370ba54 Correctly reraise exception 2018-01-12 14:44:02 +00:00
Erik Johnston
85a4d78213 Make Responder a context manager 2018-01-12 13:32:03 +00:00