Erik Johnston
1058d14127
Make the in flight background process metrics thread safe
2018-08-20 17:27:24 +01:00
Richard van der Hoff
bab94da79c
fix metric name
2018-08-07 22:11:45 +01:00
Richard van der Hoff
53bca4690b
more metrics for the federation and appservice senders
2018-08-07 19:09:48 +01:00
Richard van der Hoff
03751a6420
Fix some looping_call calls which were broken in #3604
...
It turns out that looping_call does check the deferred returned by its
callback, and (at least in the case of client_ips), we were relying on this,
and I broke it in #3604.
Update run_as_background_process to return the deferred, and make sure we
return it to clock.looping_call.
2018-07-26 11:48:08 +01:00
Richard van der Hoff
6e3fc657b4
Resource tracking for background processes
...
This introduces a mechanism for tracking resource usage by background
processes, along with an example of how it will be used.
This will help address #3518, but more importantly will give us better insights
into things which are happening but not being shown up by the request metrics.
We *could* do this with Measure blocks, but:
- I think having them pulled out as a completely separate metric class will
make it easier to distinguish top-level processes from those which are
nested.
- I want to be able to report on in-flight background processes, and I don't
think we want to do this for *all* Measure blocks.
2018-07-18 10:50:33 +01:00
Amber Brown
49af402019
run isort
2018-07-09 16:09:20 +10:00
Amber Brown
6350bf925e
Attempt to be more performant on PyPy (#3462)
2018-06-28 14:49:57 +01:00
Richard van der Hoff
cbbfaa4be8
Fix description of "python_gc_time" metric
2018-06-21 10:02:42 +01:00
Matthew Hodgson
ccfdaf68be
spell gauge correctly
2018-06-16 07:10:34 +01:00
Amber Brown
f116f32ace
add a last seen metric (#3396)
2018-06-14 20:26:59 +10:00
Richard van der Hoff
694968fa81
Hopefully, fix LaterGuage error handling
2018-06-04 15:59:14 +01:00
Amber Brown
febe0ec8fd
Run Prometheus on a different port, optionally. (#3274)
2018-05-31 19:04:50 +10:00
Matthew Hodgson
ff1bc0a279
pep8
2018-05-29 02:32:15 +01:00
Matthew Hodgson
0a240ad36e
disable CPUMetrics if no /proc/self/stat
...
fixes build on macOS again
2018-05-29 02:23:30 +01:00
Amber Brown
5c40ce3777
invalid syntax :(
2018-05-28 19:16:09 +10:00
Amber Brown
a2eb5db4a0
update metrics to be in seconds
2018-05-28 19:10:27 +10:00
Amber Brown
389dac2c15
pepeightttt
2018-05-23 13:08:59 -05:00
Amber Brown
472a5ec4e2
add back CPU metrics
2018-05-23 13:03:56 -05:00
Amber Brown
b6063631c3
more cleanup
2018-05-22 17:36:20 -05:00
Amber Brown
53cc2cde1f
cleanup
2018-05-22 17:32:57 -05:00
Amber Brown
85ba83eb51
fixes
2018-05-22 16:28:23 -05:00
Amber Brown
a8990fa2ec
Merge remote-tracking branch 'origin/develop' into 3218-official-prom
2018-05-22 10:50:26 -05:00
Amber Brown
df9f72d9e5
replacing portions
2018-05-21 19:47:37 -05:00
Amber Brown
c60e0d5e02
don't need the resource portion
2018-05-21 17:03:20 -05:00
Amber Brown
f258deffcb
remove old metrics libs
2018-05-21 17:01:15 -05:00
Erik Johnston
6d8ec3462d
Note that label values can be anything
2018-05-03 16:25:05 +01:00
Erik Johnston
95b6912045
Fix metrics that have integer value labels
2018-05-03 15:51:04 +01:00
Erik Johnston
a41117c63b
Make _escape_character take MatchObject
2018-05-02 17:27:27 +01:00
Erik Johnston
32015e1109
Escape label values in prometheus metrics
2018-05-02 16:52:42 +01:00
Erik Johnston
d7bf3a68f0
s/list/tuple
2018-04-12 11:19:04 +01:00
Erik Johnston
4dae4a97ed
Track last processed event received_ts
2018-04-11 14:27:09 +01:00
Erik Johnston
92e34615c5
Track where event stream processing has gotten up to
2018-04-11 12:13:40 +01:00
Erik Johnston
ab825aa328
Add GaugeMetric
2018-04-11 12:13:40 +01:00
Vincent Breitmoser
6d7f0f8dd3
Don't disable GC when running on PyPy
...
PyPy's incminimark GC can't be triggered manually. From what I observed
there are no obvious issues with just letting it run normally. And
unlike CPython, it actually returns unused RAM to the system.
Signed-off-by: Vincent Breitmoser <look@my.amazin.horse>
2018-04-10 11:35:34 +02:00
Richard van der Hoff
88541f9009
Add a metric which increments when a request is received
...
It's useful to know when there are peaks in incoming requests - which isn't
quite the same as there being peaks in outgoing responses, due to the time
taken to handle requests.
2018-03-09 16:30:26 +00:00
Richard van der Hoff
bc496df192
report metrics on number of cache evictions
2018-02-05 15:34:01 +00:00
Richard van der Hoff
87b7d72760
Add some comments about the reactor tick time metric
2018-01-19 23:51:04 +00:00
Richard van der Hoff
ce236f8ac8
better exception logging in callbackmetrics
...
when we fail to render a metric, give a clue as to which metric it was
2018-01-18 11:30:49 +00:00
Richard van der Hoff
992018d1c0
mechanism to render metrics with alternative names
2018-01-15 17:04:39 +00:00
Richard van der Hoff
80fa610f9c
Add some comments to metrics classes
2018-01-15 16:52:52 +00:00
Richard van der Hoff
19d274085f
Make Counter render floats
...
Prometheus handles all metrics as floats, and sometimes we store non-integer
values in them (notably, durations in seconds), so let's render them as floats
too.
(Note that the standard client libraries also treat Counters as floats.)
2018-01-12 23:49:44 +00:00
Paul "LeoNerd" Evans
2938a00825
Rename the python-specific metrics now the docs claim that we have done
2016-11-03 17:03:52 +00:00
Paul "LeoNerd" Evans
5219f7e060
Since we don't export per-filetype fd counts any more, delete all the code related to that too
2016-11-03 16:41:32 +00:00
Paul "LeoNerd" Evans
93ebeb2aa8
Remove now-unused 'resource' import
2016-11-03 16:37:09 +00:00
Paul "LeoNerd" Evans
c1b077cd19
Now we have new-style metrics don't bother exporting legacy-named process ones
2016-11-03 16:34:16 +00:00
Paul "LeoNerd" Evans
1cc22da600
Set up the process collector during metrics __init__; that way all split-process workers have it
2016-10-27 18:09:34 +01:00
Paul "LeoNerd" Evans
aac13b1f9a
Pass the Metrics group into the process collector instead of having it find its own one; this avoids it needing to import from synapse.metrics
2016-10-27 18:08:15 +01:00
Paul "LeoNerd" Evans
ccc1a3d54d
Allow creation of a 'subspace' within a Metrics object, returning another one
2016-10-27 18:07:34 +01:00
Paul "LeoNerd" Evans
b01aaadd48
Split callback metric lambda functions down onto their own lines to keep line lengths under 90
2016-10-19 18:26:13 +01:00
Paul "LeoNerd" Evans
1071c7d963
Adjust code for <100 char line limit
2016-10-19 18:23:25 +01:00
Paul "LeoNerd" Evans
6453d03edd
Cut the raw /proc/self/stat line up into named fields at collection time
2016-10-19 18:21:40 +01:00
Paul "LeoNerd" Evans
3ae48a1f99
Move the process metrics collector code into its own file
2016-10-19 18:10:24 +01:00
Paul "LeoNerd" Evans
4cedd53224
A slightly neater way to manage metric collector functions
2016-10-19 17:54:09 +01:00
Paul "LeoNerd" Evans
5663137e03
appease pep8
2016-10-19 16:09:42 +01:00
Paul "LeoNerd" Evans
b202531be6
Also guard /proc/self/fds-related code with a suitable pseudoconstant
2016-10-19 15:37:41 +01:00
Paul "LeoNerd" Evans
1b179455fc
Guard registration of process-wide metrics by existence of the requisite /proc entries
2016-10-19 15:34:38 +01:00
Paul "LeoNerd" Evans
981f852d54
Add standard process_start_time_seconds metric
2016-10-19 15:05:22 +01:00
Paul "LeoNerd" Evans
def63649df
Add standard process_max_fds metric
2016-10-19 15:05:21 +01:00
Paul "LeoNerd" Evans
06f1ad1625
Add standard process_open_fds metric
2016-10-19 15:05:21 +01:00
Paul "LeoNerd" Evans
95fc70216d
Add standard process_*_memory_bytes metrics
2016-10-19 15:05:21 +01:00
Paul "LeoNerd" Evans
9b0316c75a
Use /proc/self/stat to generate the new process_cpu_*_seconds_total metrics
2016-10-19 15:05:21 +01:00
Paul "LeoNerd" Evans
03c2720940
Export CPU usage metrics also under prometheus-standard metric name
2016-10-19 15:05:21 +01:00
Paul "LeoNerd" Evans
b21b9dbc37
Callback metric values might not just be integers - allow floats
2016-10-19 15:05:15 +01:00
Erik Johnston
7c1a92274c
Make psutil optional
2016-08-08 11:12:21 +01:00
Erik Johnston
d36b1d849d
Don't explode if we have no snapshots yet
2016-07-20 16:59:52 +01:00
Erik Johnston
66868119dc
Add metrics for psutil derived memory usage
2016-07-20 16:00:21 +01:00
Erik Johnston
0f2165ccf4
Don't track total objects as it's too expensive to calculate
2016-06-07 17:00:45 +01:00
Erik Johnston
18f0cc7d99
Record some more GC metrics
2016-06-07 16:55:49 +01:00
Erik Johnston
48e65099b5
Also record number of unreachable objects
2016-06-07 13:40:22 +01:00
Erik Johnston
75331c5fca
Change the way we do stats
2016-06-07 13:33:13 +01:00
Erik Johnston
8c966fbd51
Merge pull request #771 from matrix-org/erikj/gc_tick
...
Manually run GC on reactor tick.
2016-06-07 13:18:36 +01:00
Erik Johnston
73c7112433
Change CacheMetrics to be quicker
...
We change it so that each cache has an individual CacheMetric, instead
of having one global CacheMetric. This means that when a cache tries to
increment a counter it does not need to go through so many indirections.
2016-06-03 11:26:52 +01:00
Erik Johnston
60d53f9e95
Count number of GC collects
2016-05-16 09:34:42 +01:00
Erik Johnston
7d6e89ed22
Add a comment
2016-05-13 16:31:08 +01:00
Erik Johnston
1f1dee94f6
Manually run GC on reactor tick.
...
This also adds a metric for amount of time spent in GC.
2016-05-09 10:13:25 +01:00
Matthew Hodgson
6c28ac260c
copyrights
2016-01-07 04:26:29 +00:00
Mark Haines
709ba99afd
Check that /proc/self/fd exists before listing it
2015-09-07 16:45:55 +01:00
Mark Haines
9e4dacd5e7
The maxrss reported by getrusage is in kilobytes, not pages
2015-09-07 16:45:48 +01:00
Erik Johnston
6e7d36a72c
Also check for presence of 'threadCallQueue' in reactor
2015-08-18 11:51:08 +01:00
Erik Johnston
d3da63f766
Use more helpful variable names
2015-08-18 11:47:00 +01:00
Erik Johnston
891dfd90bd
Fix pending_calls metric to not lie
2015-08-14 15:43:11 +01:00
Erik Johnston
a6c27de1aa
Don't time getDelayedCalls
2015-08-13 11:41:57 +01:00
Erik Johnston
ba5d34a832
Add some metrics about the reactor
2015-08-13 11:38:59 +01:00
Paul "LeoNerd" Evans
ef1e019840
Appease pep8
2015-04-01 19:17:38 +01:00
Paul "LeoNerd" Evans
5583e29513
Report process open filehandles in metrics
2015-04-01 19:15:23 +01:00
Paul "LeoNerd" Evans
05a056a409
Appease pyflakes
2015-03-12 16:45:05 +00:00
Paul "LeoNerd" Evans
0eb7e6b9a8
Delete unused import of NOT_READY_YET
2015-03-12 16:39:52 +00:00
Paul "LeoNerd" Evans
128cf2daf7
Appease pep8
2015-03-12 16:24:51 +00:00
Paul "LeoNerd" Evans
2e4f0b2bd7
Replace the @metrics.counted annotations in federation with specifically-written counters and distributions
2015-03-12 16:24:51 +00:00
Paul "LeoNerd" Evans
c1cdd7954d
Add an .inc_by() method to CounterMetric; implement DistributionMetric a neater way
2015-03-12 16:24:51 +00:00
Paul "LeoNerd" Evans
493e3fa0ca
Don't forbid '_' in metric basenames any more, to allow things like foo_time
2015-03-12 16:24:51 +00:00
Paul "LeoNerd" Evans
f1fbe3e09f
Rename TimerMetric to DistributionMetric; as it could count more than just time
2015-03-12 16:24:51 +00:00
Paul "LeoNerd" Evans
cbc0406be8
Export CacheMetric as hits+total, rather than hits+misses, as it's easier to derive hit ratio from that
2015-03-12 16:24:51 +00:00
Paul "LeoNerd" Evans
4d661ec0f3
Remember to emit final linefeed from /metrics page, or Prometheus gets upset
2015-03-12 16:24:51 +00:00
Paul "LeoNerd" Evans
0e847540c3
Prometheus needs "escaped" label values
2015-03-12 16:24:51 +00:00
Paul "LeoNerd" Evans
22b37b75db
Kill unused CounterMetric.fetch() method
2015-03-12 16:24:51 +00:00
Paul "LeoNerd" Evans
b0cf867319
Use _ instead of . as a metric namespacing separator, for Prometheus
2015-03-12 16:24:51 +00:00
Paul "LeoNerd" Evans
0b96bb793e
Have all @metrics.counted use a single metric name vectored on the method name, rather than a brand new scalar counter per counted method
2015-03-12 16:24:51 +00:00
Paul "LeoNerd" Evans
b3a0179d64
Bugfix to rendering output of vectored TimerMetrics
2015-03-12 16:24:51 +00:00
Paul "LeoNerd" Evans
f9478e475b
Rename Metrics' "keys" to "labels"
2015-03-12 16:24:51 +00:00