Commit Graph

49 Commits

Author SHA1 Message Date
James Salter
d902181de9
Unified search query syntax using the full-text search capabilities of the underlying DB. ()
Support a unified search query syntax which leverages more of the full-text
search of each database supported by Synapse.

Supports, with the same syntax across Postgresql 11+ and Sqlite:

- quoted "search terms"
- `AND`, `OR`, `-` (negation) operators
- Matching words based on their stem, e.g. searches for "dog" matches
  documents containing "dogs". 

This is achieved by 

- If on postgresql 11+, pass the user input to `websearch_to_tsquery`
- If on sqlite, manually parse the query and transform it into the sqlite-specific
  query syntax.

Note that postgresql 10, which is close to end-of-life, falls back to using
`phraseto_tsquery`, which only supports a subset of the features.

Multiple terms separated by a space are implicitly ANDed.

Note that:

1. There is no escaping of full-text syntax that might be supported by the database;
  e.g. `NOT`, `NEAR`, `*` in sqlite. This runs the risk that people might discover this
  as accidental functionality and depend on something we don't guarantee.
2. English text is assumed for stemming. To support other languages, either the target
  language needs to be known at the time of indexing the message (via room metadata,
  or otherwise), or a separate index for each language supported could be created.

Sqlite docs: https://www.sqlite.org/fts3.html#full_text_index_queries
Postgres docs: https://www.postgresql.org/docs/11/textsearch-controls.html
2022-10-25 14:05:22 -04:00
David Robertson
0a38c7ec6d
Snapshot schema 72 ()
Including another batch of fixes to the schema dump script
2022-09-26 18:28:32 +01:00
David Robertson
f2d2481e56
Require SQLite >= 3.27.0 () 2022-09-09 11:14:10 +01:00
David Robertson
586bfc6dc0
Use dummy fallback engines if imports fail () 2022-06-07 17:33:55 +01:00
David Robertson
1fe202a1a3
Tidy up and type-hint the database engine modules ()
Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
2022-05-17 00:34:38 +01:00
Shay
e78d4f61fc
Refuse to start if DB has an unsafe locale () 2022-03-23 10:23:05 -07:00
Nick Barrett
b59d285f7c
Db txn set isolation level ()
Co-authored-by: Brendan Abolivier <babolivier@matrix.org>
2022-01-25 15:14:46 +01:00
Shay
9006ee36d1
Drop support for and remove references to EOL Python 3.6 ()
* remove reference in comments to python3.6

* upgrade tox python env in script

* bump python version in example for completeness

* upgrade python version requirement in setup doc

* upgrade necessary python version in __init__.py

* upgrade python version in setup.py

* newsfragment

* drops refs to bionic and replace with focal

* bump refs to postgres 9.6 to 10

* fix hanging ci

* try installing tzdata first

* revert change made in b979f336

* ignore new random mypy error while debugging other error

* fix lint error for temporary workaround

* revert change to install list

* try passing env var

* export debian frontend var?

* move line and add comment

* bump pillow dependency

* bump lxml depenency

* install libjpeg-dev for pillow

* bump automat version to one compatible with py3.8

* add libwebp for pillow

* bump twisted trunk python version

* change suffix of newsfragment

* remove redundant python 3.7 checks

* lint
2022-01-21 14:23:26 -08:00
Erik Johnston
329ef5c715
Fix the inbound PDU metric ()
This broke in 
2021-06-30 12:07:16 +01:00
Jonathan de Jong
4b965c862d
Remove redundant "coding: utf-8" lines ()
Part of 

Removes all redundant `# -*- coding: utf-8 -*-` lines from files, as python 3 automatically reads source code as utf-8 now.

`Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`
2021-04-14 15:34:27 +01:00
Richard van der Hoff
3ada9b4264 Drop support for sqlite<3.22 as well 2021-04-08 16:42:32 +01:00
Richard van der Hoff
abade34633 Require py36 and Postgres 9.6 2021-04-08 16:42:32 +01:00
Eric Eastwood
0a00b7ff14
Update black, and run auto formatting over the codebase ()
- Update black version to the latest
 - Run black auto formatting over the codebase
    - Run autoformatting according to [`docs/code_style.md
`](80d6dc9783/docs/code_style.md)
 - Update `code_style.md` docs around installing black to use the correct version
2021-02-16 22:32:34 +00:00
Erik Johnston
ae5b2a72c0
Reduce serialization errors in MultiWriterIdGen ()
We call `_update_stream_positions_table_txn` a lot, which is an UPSERT
that can conflict in `REPEATABLE READ` isolation level. Instead of doing
a transaction consisting of a single query we may as well run it outside
of a transaction.
2020-10-07 15:15:57 +01:00
Richard van der Hoff
3c36ae17a5 Use SequenceGenerator for state group ID allocation 2020-07-16 11:25:08 +01:00
Richard van der Hoff
244dbb04f7
Fix incorrect error message when database CTYPE was set incorrectly. () 2020-07-01 13:56:16 +01:00
Richard van der Hoff
132b673dbe
Add some type annotations in synapse.storage ()
I cracked, and added some type definitions in synapse.storage.
2020-02-27 11:53:40 +00:00
Uday Bansal
7728d87fd7
Updated warning for incorrect database collation/ctype ()
Signed-off-by: Uday Bansal <43824981+udaybansal19@users.noreply.github.com>
2020-02-26 15:17:03 +00:00
Erik Johnston
02b44db922
Warn if postgres database has non-C locale. ()
As using non-C locale can cause issues on upgrading OS.
2020-01-28 13:44:21 +00:00
Richard van der Hoff
bf46821180 Refuse to start if sqlite is older than 3.11.0 2020-01-09 18:11:04 +00:00
Richard van der Hoff
e48ba84e0b Check postgres version in check_database
this saves doing it on each connection, and will allow us to pass extra options
in.
2020-01-09 18:05:59 +00:00
Richard van der Hoff
e97d1cf001 Modify check_database to take a connection rather than a cursor
We might not need the cursor at all.
2020-01-09 18:05:50 +00:00
Erik Johnston
83d86106a8
Merge pull request from matrix-org/erikj/postgres_any
Use Postgres ANY for selecting many values.
2019-10-10 16:41:36 +01:00
Erik Johnston
3bc687508f Remove add_in_list_sql_clause 2019-10-10 15:35:46 +01:00
Erik Johnston
1d3858371e Disable bytes usage with postgres
More often than not passing bytes to `txn.execute` is a bug (where we
meant to pass a string) that just happens to work if `BYTEA_OUTPUT` is
set to `ESCAPE`. However, this is a bit of a footgun so we want to
instead error when this happens, and force using `bytearray` if we
actually want to use bytes.
2019-10-08 16:28:57 +01:00
Erik Johnston
9267741a5f Fix devices_last_seen background update.
Fixes .
2019-09-30 11:58:36 +01:00
Amber Brown
eba7caf09f
Remove Postgres 9.4 support () 2019-06-18 00:59:00 +10:00
Amber Brown
7efd1d87c2 Run black on the rest of the storage module () 2019-04-03 10:07:29 +01:00
Richard van der Hoff
f191be822b
Add database version to phonehome stats. () 2019-02-27 10:21:49 +00:00
Amber Brown
58f6c48183
Use native UPSERTs where possible () 2019-01-24 21:31:54 +11:00
Amber Brown
14e4d4f4bf
Port storage/ to Python 3 () 2018-08-31 00:19:58 +10:00
Erik Johnston
3d33eef6fc
Store state groups separately from events ()
* Split state group persist into seperate storage func

* Add per database engine code for state group id gen

* Move store_state_group to StateReadStore

This allows other workers to use it, and so resolve state.

* Hook up store_state_group

* Fix tests

* Rename _store_mult_state_groups_txn

* Rename StateGroupReadStore

* Remove redundant _have_persisted_state_group_txn

* Update comments

* Comment compute_event_context

* Set start val for state_group_id_seq

... otherwise we try to recreate old state groups

* Update comments

* Don't store state for outliers

* Update comment

* Update docstring as state groups are ints
2018-02-06 14:31:24 +00:00
Mark Haines
d5fb561709 Optionally make committing to postgres asynchronous.
Useful when running tests when you don't care whether the server
will lose data that it claims that it has committed.
2016-06-20 17:53:38 +01:00
Erik Johnston
8aab9d87fa Don't require config to create database 2016-04-06 14:15:45 +01:00
Daniel Wagner-Hall
763360594d Mark AS users with their AS's ID 2016-02-11 17:26:42 +00:00
Matthew Hodgson
6c28ac260c copyrights 2016-01-07 04:26:29 +00:00
Erik Johnston
17c80c8a3d rename schema_prepare to prepare_database 2015-10-13 13:56:22 +01:00
Erik Johnston
ec398af41c Expose error more nicely 2015-10-13 11:43:43 +01:00
Erik Johnston
40b6a5aad1 Split out the schema preparation and update logic into its own module 2015-10-13 11:38:48 +01:00
Erik Johnston
1d566edb81 Remove race condition 2015-05-14 16:54:35 +01:00
Erik Johnston
1692dc019d Don't call 'encode_parameter' no-op 2015-05-05 15:00:30 +01:00
Erik Johnston
fabb7acd45 Fix bug where we reconnected to the database on every query. 2015-05-01 10:24:24 +01:00
Erik Johnston
cd0864121b Make postgres database error slightly more helpful 2015-04-29 12:12:25 +01:00
Erik Johnston
204132a998 Check that postgres database has correct charset set 2015-04-29 11:42:28 +01:00
Erik Johnston
2732be83d9 Shuffle operations so that locking upsert happens last in the txn. This ensures the lock is held for the least amount of time possible. 2015-04-27 13:22:30 +01:00
Erik Johnston
e4c4664d73 Handle the fact that postgres databases can be restarted from under us 2015-04-27 12:40:49 +01:00
Erik Johnston
b8092fbc82 Go back to storing JSON in TEXT 2015-04-16 11:17:52 +01:00
Erik Johnston
c756dfeb14 Correctly identify deadlocks 2015-04-15 10:23:42 +01:00
Erik Johnston
127fad17dd Add postgres database engine 2015-04-14 14:50:29 +01:00