40 Commits

Author SHA1 Message Date
Erik Johnston
cf82338930
Merge pull request #4627 from matrix-org/erikj/user_ips_analyze
Analyze user_ips before running deduplication
2019-02-12 13:05:09 +00:00
Erik Johnston
495ea92350 Fix pep8 2019-02-12 12:40:42 +00:00
Erik Johnston
483ba85c7a Analyze user_ips before running deduplication
Due to the table locks taken out by the naive upsert, the table
statistics may be out of date. During deduplication it is important that
the correct index is used as otherwise a full table scan may be
incorrectly used, which can end up thrashing the database badly.
2019-02-12 11:55:27 +00:00
Erik Johnston
362d80b770 Reduce user_ips bloat during dedupe background update
The background update to remove duplicate rows naively deleted and
reinserted the duplicates. For large tables with a large number of
duplicates this causes a lot of bloat (with postgres), as the inserted
rows are appended to the table, since deleted rows will not be
overwritten until a VACUUM has happened.

This should hopefully also help ensure that the query in the last batch
uses the correct index, as inserting a large number of new rows without
analyzing will upset the query planner.
2019-02-12 11:39:34 +00:00
Amber Brown
58f6c48183
Use native UPSERTs where possible (#4306) 2019-01-24 21:31:54 +11:00
Erik Johnston
90743c9d89 Fixup removal of duplicate user_ips rows (#4432)
* Remove unnecessary ORDER BY clause

* Add logging

* Newsfile
2019-01-23 19:45:18 +11:00
Erik Johnston
7f503f83b9 Refactor to rewrite the SQL instead 2019-01-22 16:31:05 +00:00
Erik Johnston
1c9704f8ab Don't shadow params 2019-01-22 16:20:33 +00:00
Erik Johnston
2557531f0f Fix bug when removing duplicate rows from user_ips
This was caused by accidentally overwritting a `last_seen` variable
in a for loop, causing the wrong value to be written to the progress
table. The result of which was that we didn't scan sections of the table
when searching for duplicates, and so some duplicates did not get
deleted.
2019-01-22 13:33:46 +00:00
Amber Brown
a35c66a00b
Remove duplicates in the user_ips table and add an index (#4370) 2019-01-12 06:21:50 +11:00
Amber Brown
1f3f5fcf52
Fix client IPs being broken on Python 3 (#3908) 2018-09-20 20:14:34 +10:00
Amber Brown
99dd975dae
Run tests under PostgreSQL (#3423) 2018-08-13 16:47:46 +10:00
Neil Johnson
e10830e976 wip commit - tests failing 2018-08-03 17:55:50 +01:00
Neil Johnson
74b1d46ad9 do mau checks based on monthly_active_users table 2018-08-02 16:57:35 +01:00
Neil Johnson
00f99f74b1 insertion into monthly_active_users 2018-08-02 13:47:19 +01:00
Richard van der Hoff
03751a6420 Fix some looping_call calls which were broken in #3604
It turns out that looping_call does check the deferred returned by its
callback, and (at least in the case of client_ips), we were relying on this,
and I broke it in #3604.

Update run_as_background_process to return the deferred, and make sure we
return it to clock.looping_call.
2018-07-26 11:48:08 +01:00
Richard van der Hoff
667fba68f3 Run things as background processes
This fixes #3518, and ensures that we get useful logs and metrics for lots of
things that happen in the background.

(There are certainly more things that happen in the background; these are just
the common ones I've found running a single-process synapse locally).
2018-07-18 20:55:05 +01:00
Amber Brown
49af402019 run isort 2018-07-09 16:09:20 +10:00
Amber Brown
77ac14b960
Pass around the reactor explicitly (#3385) 2018-06-22 09:37:10 +01:00
Adrian Tschira
933bf2dd35 replace some iteritems with six
Signed-off-by: Adrian Tschira <nota@notafile.com>
2018-05-19 17:59:26 +02:00
Neil Johnson
617bf40924 Generate user daily stats 2018-04-25 17:37:29 +01:00
Neil Johnson
788e69098c Add user_ips last seen index 2018-03-28 12:03:13 +01:00
Richard van der Hoff
6cfee09be9 Make __init__ consitstent across Store heirarchy
Add db_conn parameters to the `__init__` methods of the *Store classes, so that
they are all consistent, which makes the multiple inheritance work correctly
(and so that we can later extract mixins which can be used in the slavedstores)
2017-11-13 10:46:07 +00:00
Erik Johnston
ed9a7f5436 Merge pull request #2309 from matrix-org/erikj/user_ip_repl
Fix up user_ip replication commands
2017-07-06 14:33:14 +01:00
Erik Johnston
b5e8d529e6 Define CACHE_SIZE_FACTOR once 2017-07-04 09:56:44 +01:00
Erik Johnston
8c23221666 Fix up 2017-06-27 15:53:45 +01:00
Erik Johnston
a0a561ae85 Fix up client ips to read from pending data 2017-06-27 14:46:12 +01:00
Erik Johnston
ed3d0170d9 Batch upsert user ips 2017-06-27 13:37:04 +01:00
Erik Johnston
6f83c4537c Increase size of IP cache 2017-06-07 10:18:44 +01:00
Erik Johnston
dcabef952c Increase client_ip cache size 2017-05-08 15:09:19 +01:00
Erik Johnston
85657eedf8 Bail on where clause instead 2017-04-11 16:24:31 +01:00
Erik Johnston
b48045a8f5 Don't bother with outer check for now 2017-04-11 16:23:24 +01:00
Erik Johnston
34840cdcef Fix getting latest device IP for user with no devices 2017-04-11 09:56:54 +01:00
Richard van der Hoff
363786845b PEP8 2016-07-22 13:21:07 +01:00
Richard van der Hoff
ec5717caf5 Create index on user_ips in the background
user_ips is kinda big, so really we want to add the index in the background
once we're running. Replace the schema delta with one which will do that.

I've done this in a way that's reasonably easy to reuse as there a few other
indexes I need, and I don't suppose they will be the last.
2016-07-22 13:16:39 +01:00
Richard van der Hoff
c445f5fec7 storage/client_ips: remove some dead code 2016-07-21 11:58:47 +01:00
Richard van der Hoff
7314bf4682 Merge branch 'develop' into rav/get_devices_api
(pick up PR #938 in the hope of fixing the UTs)
2016-07-20 17:40:00 +01:00
Richard van der Hoff
bc8f265f0a GET /devices endpoint
implement a GET /devices endpoint which lists all of the user's devices.

It also returns the last IP where we saw that device, so there is some dancing
to fish that out of the user_ips table.
2016-07-20 16:42:32 +01:00
Richard van der Hoff
ec041b335e Record device_id in client_ips
Record the device_id when we add a client ip; it's somewhat redundant as we
could get it via the access_token, but it will make querying rather easier.
2016-07-20 16:41:03 +01:00
Mark Haines
eef541a291 Move insert_client_ip to a separate class 2016-06-03 14:42:35 +01:00