Commit Graph

135 Commits

Author SHA1 Message Date
Erik Johnston
b2aa05a8d6 Use efficient .intersection 2018-07-17 11:07:04 +01:00
Erik Johnston
547b1355d3 Fix perf regression in PR #3530
The get_entities_changed function was changed to return all changed
entities since the given stream position, rather than only those changed
from a given list of entities. This resulted in the function incorrectly
returning large numbers of entities that, for example, caused large
increases in database usage.
2018-07-17 10:27:51 +01:00
Erik Johnston
77b692e65d Don't return unknown entities in get_entities_changed
The stream cache keeps track of all entities that have changed since
a particular stream position, so get_entities_changed does not need to
return unknown entites when given a larger stream position.

This makes it consistent with the behaviour of has_entity_changed.
2018-07-13 15:26:10 +01:00
Richard van der Hoff
fa5c2bc082 Reduce set building in get_entities_changed
This line shows up as about 5% of cpu time on a synchrotron:

    not_known_entities = set(entities) - set(self._entity_to_key)

Presumably the problem here is that _entity_to_key can be largeish, and
building a set for its keys every time this function is called is slow.

Here we rewrite the logic to avoid building so many sets.
2018-07-12 11:37:44 +01:00
Amber Brown
49af402019 run isort 2018-07-09 16:09:20 +10:00
Amber Brown
72d2143ea8
Revert "Revert "Try to not use as much CPU in the StreamChangeCache"" (#3454) 2018-06-28 11:04:18 +01:00
Matthew Hodgson
8057489b26
Revert "Try to not use as much CPU in the StreamChangeCache" 2018-06-26 18:09:01 +01:00
Amber Brown
1202508067 fixes 2018-06-26 17:29:01 +01:00
Amber Brown
bd3d329c88 fixes 2018-06-26 17:28:12 +01:00
Amber Brown
abfe4b2957 try and make loading items from the cache faster 2018-06-26 17:25:34 +01:00
Richard van der Hoff
43e02c409d Disable partial state group caching for wildcard lookups
When _get_state_for_groups is given a wildcard filter, just do a complete
lookup. Hopefully this will give us the best of both worlds by not filling up
the ram if we only need one or two keys, but also making the cache still work
for the federation reader usecase.
2018-06-22 11:52:07 +01:00
Amber Brown
f7869f8f8b
Port to sortedcontainers (with tests!) (#3332) 2018-06-06 00:13:57 +10:00
Erik Johnston
042eedfa2b Add hacky cache factor override system 2018-06-04 15:39:28 +01:00
Amber Brown
c936a52a9e
Consistently use six's iteritems and wrap lazy keys/values in list() if they're not meant to be lazy (#3307) 2018-05-31 19:03:47 +10:00
Amber Brown
debff7ae09
Merge pull request #3281 from NotAFile/py3-six-isinstance
remaining isintance fixes
2018-05-30 12:44:46 +10:00
Amber Brown
357c74a50f add comment about why unreg 2018-05-28 19:14:41 +10:00
Amber Brown
754826a830 Merge remote-tracking branch 'origin/develop' into 3218-official-prom 2018-05-28 18:57:23 +10:00
Adrian Tschira
dd068ca979 remaining isintance fixes
Signed-off-by: Adrian Tschira <nota@notafile.com>
2018-05-24 20:55:08 +02:00
Amber Brown
071206304d cleanup pep8 errors 2018-05-22 16:54:22 -05:00
Amber Brown
85ba83eb51 fixes 2018-05-22 16:28:23 -05:00
Amber Brown
df9f72d9e5 replacing portions 2018-05-21 19:47:37 -05:00
Adrian Tschira
73cbdef5f7 fix py3 intern and remove unnecessary py3 encode
Signed-off-by: Adrian Tschira <nota@notafile.com>
2018-05-19 17:35:31 +02:00
Richard van der Hoff
11a67b7c9d
Merge pull request #3093 from matrix-org/rav/response_cache_wrap
Refactor ResponseCache usage
2018-04-20 11:31:17 +01:00
Richard van der Hoff
d3347ad485 Revert "Use sortedcontainers instead of blist"
This reverts commit 9fbe70a7dc.

It turns out that sortedcontainers.SortedDict is not an exact match for
blist.sorteddict; in particular, `popitem()` removes things from the opposite
end of the dict.

This is trivial to fix, but I want to add some unit tests, and potentially some
more thought about it, before we do so.
2018-04-13 11:16:43 +01:00
Richard van der Hoff
60f6014bb7 ResponseCache: fix handling of completed results
Turns out that ObservableDeferred.observe doesn't return a deferred if the
result is already completed. Fix handling and improve documentation.
2018-04-13 07:32:29 +01:00
Richard van der Hoff
b78395b7fe Refactor ResponseCache usage
Adds a `.wrap` method to ResponseCache which wraps up the boilerplate of a
(get, set) pair, and then use it throughout the codebase.

This will be largely non-functional, but does include the following functional
changes:

* federation_server.on_context_state_request: drops use of _server_linearizer
  which looked redundant and could cause incorrect cache misses by yielding
  between the get and the set.
* RoomListHandler.get_remote_public_room_list(): fixes logcontext leaks
* the wrap function includes some logging. I'm hoping this won't be too noisy
  on production.
2018-04-12 13:02:15 +01:00
Richard van der Hoff
d5c74b9f6c
Merge pull request #3092 from matrix-org/rav/response_cache_metrics
Add metrics for ResponseCache
2018-04-12 12:59:36 +01:00
Richard van der Hoff
261124396e
Merge pull request #3059 from matrix-org/rav/doc_response_cache
Document the behaviour of ResponseCache
2018-04-12 11:22:30 +01:00
Richard van der Hoff
b3384232a0 Add metrics for ResponseCache 2018-04-10 23:14:47 +01:00
Vincent Breitmoser
9fbe70a7dc Use sortedcontainers instead of blist
This commit drop-in replaces blist with SortedContainers. They are
written in pure python so work with pypy, but perform as good as
native implementations, at least in a couple benchmarks:

http://www.grantjenks.com/docs/sortedcontainers/performance.html
2018-04-10 11:29:51 +02:00
Richard van der Hoff
01afc563c3 Fix overzealous cache invalidation
Fixes an issue where a cache invalidation would invalidate *all* pending
entries, rather than just the entry that we intended to invalidate.
2018-04-05 16:24:04 +01:00
Richard van der Hoff
a9a74101a4 Document the behaviour of ResponseCache
it looks like everything that uses ResponseCache expects to have to
`make_deferred_yieldable` its results. It's debatable whether that is the best
approach, but let's document it for now to avoid further confusion.
2018-04-04 09:06:22 +01:00
Erik Johnston
9a0d783c11 Add comments 2018-03-19 11:35:53 +00:00
Erik Johnston
7c7706f42b Fix bug where state cache used lots of memory
The state cache bases its size on the sum of the size of entries. The
size of the entry is calculated once on insertion, so it is important
that the size of entries does not change.

The DictionaryCache modified the entries size, which caused the state
cache to incorrectly think it was smaller than it actually was.
2018-03-15 15:46:54 +00:00
Richard van der Hoff
bc496df192 report metrics on number of cache evictions 2018-02-05 15:34:01 +00:00
Erik Johnston
495f075b41 Increase default cache factor size. 2017-07-04 09:58:32 +01:00
Erik Johnston
b5e8d529e6 Define CACHE_SIZE_FACTOR once 2017-07-04 09:56:44 +01:00
Erik Johnston
c72058bcc6 Use an ExpiringCache for storing registration sessions
This is because pruning them was a significant performance drain on
matrix.org
2017-06-29 14:08:37 +01:00
Erik Johnston
efc2b7db95 Rewrite conditional 2017-06-09 13:35:15 +01:00
Erik Johnston
eed59dcc1e Fix has_any_entity_changed
Occaisonally has_any_entity_changed would throw the error: "Set changed
size during iteration" when taking the max of the `sorteddict`. While
its uncertain how that happens, its quite inefficient to iterate over
the entire dict anyway so we change to using the more traditional
`bisect_*` functions.
2017-06-09 11:44:01 +01:00
Erik Johnston
304880d185 Add stream change cache 2017-05-31 15:46:36 +01:00
Erik Johnston
bd7bb5df71 Pull out if statement from for loop 2017-05-22 15:12:19 +01:00
Erik Johnston
e3417a06e2 Update list cache to handle one arg case
We update the normal cache descriptors to handle caches with a single
argument specially so that the key wasn't a 1-tuple. We need to update
the cache list to be aware of this.
2017-05-22 15:04:42 +01:00
Erik Johnston
bbfe4e996c Make get_state_groups_from_groups faster.
Most of the time was spent copying a dict to filter out sentinel values
that indicated that keys did not exist in the dict. The sentinel values
were added to ensure that we cached the non-existence of keys.

By updating DictionaryCache to keep track of which keys were known to
not exist itself we can remove a dictionary copy.
2017-05-17 15:12:15 +01:00
Erik Johnston
ffad4fe35b Don't update event cache hit ratio from get_joined_users
Otherwise the hit ration of plain get_events gets completely skewed by
calls to get_joined_users* functions.
2017-05-08 16:06:17 +01:00
Erik Johnston
d2d8ed4884 Optimise caches with single key 2017-05-04 14:18:46 +01:00
Erik Johnston
efab1dadde Remove DEBUG_CACHES 2017-04-25 10:54:09 +01:00
Erik Johnston
119cb9bbcf Reduce cache size by not storing deferreds
Currently the cache descriptors store deferreds rather than raw values,
this is a simple way of triggering only one database hit and sharing the
result if two callers attempt to get the same value.

However, there are a few caches that simply store a mapping from string
to string (or int). These caches can have a large number of entries,
under the assumption that each entry is small. However, the size of a
deferred (specifically the size of ObservableDeferred) is signigicantly
larger than that of the raw value, 2kb vs 32b.

This PR therefore changes the cache descriptors to store the raw values
rather than the deferreds.

As a side effect cached storage function now either return a deferred or
the actual value, as the cached list decriptor already does. This is
fine as we always end up just yield'ing on the returned value
eventually, which handles that case correctly.
2017-04-25 10:23:11 +01:00
Erik Johnston
d134d0935e Only intern ascii strings 2017-04-24 14:07:48 +01:00
Erik Johnston
4d17add8de Remove unused instance variable 2017-03-31 09:38:27 +01:00