From 9279a2c4e4c1c63e87d43fc9f6ad2c495bf47e67 Mon Sep 17 00:00:00 2001 From: Brendan Abolivier Date: Fri, 3 Jan 2020 13:43:55 +0100 Subject: [PATCH 1/7] Add a complete documentation of the message retention policies support --- changelog.d/6623.doc | 1 + docs/message_retention_policies.md | 191 +++++++++++++++++++++++++++++ 2 files changed, 192 insertions(+) create mode 100644 changelog.d/6623.doc create mode 100644 docs/message_retention_policies.md diff --git a/changelog.d/6623.doc b/changelog.d/6623.doc new file mode 100644 index 000000000..c8aade097 --- /dev/null +++ b/changelog.d/6623.doc @@ -0,0 +1 @@ +Add a complete documentation of the message retention policies support. diff --git a/docs/message_retention_policies.md b/docs/message_retention_policies.md new file mode 100644 index 000000000..78055b2f6 --- /dev/null +++ b/docs/message_retention_policies.md @@ -0,0 +1,191 @@ +# Message retention policies + +Synapse admins can enable support for message retention policies on +their homeserver. Message retention policies exist at a room level, +follow the semantics described in +[MSC1763](https://github.com/matrix-org/matrix-doc/blob/matthew/msc1763/proposals/1763-configurable-retention-periods.md), +and allow server and room admins to configure how long messages should +be kept in a homeserver's database before being purged from it. + +A message retention policy is mainly defined by its `max_lifetime` +parameter, which defines how long a message can be kept around after +it's been sent in the room. If a room doesn't have a message retention +policy, and there's no default one for a given server, then no message +sent in that room is ever purged on that server. + +MSC1763 also specifies semantics for a `min_lifetime` parameter which +defines the amount of time after which an event _can_ get purged (after +it's been sent to the room), but Synapse doesn't currently support it +beyond registering it. 
+ +Both `max_lifetime` and `min_lifetime` are optional parameters. + +Note that message retention policies don't apply to state events. + +Once an event reaches its expiry date (defined as the time it was sent +plus the value for `max_lifetime` in the room), two things happen: + +* Synapse stops serving the event to clients via any endpoint. +* The message gets picked up by the next purge job (see the "Purge jobs" + section) and is removed from Synapse's database. + +Since purge jobs don't run continuously, this means that an event might +stay in a server's database for longer than the value for `max_lifetime` +in the room would allow, though hidden from clients. + +Similarly, if a server (with support for message retention policies +enabled) receives from another server an event that should have been +purged according to its room's policy, then the receiving server will +process and store that event until it's picked up by the next purge job, +though it will always hide it from clients. + + +## Room configuration + +To configure a room's message retention policy, a room's admin or +moderator needs to send a state event in that room with the type +`m.room.retention` and the following content: + +```json +{ + "max_lifetime": ... +} +``` + +In this event's content, the `max_lifetime` parameter has the same +meaning as previously described, and needs to be expressed in +milliseconds. The event's content can also include a `min_lifetime` +parameter, which has the same meaning and limited support as previously +described. + +Note that over every server in the room, only the ones with support for +message retention policies will actually remove expired events. While +we plan to eventually enable this support by default in Synapse, this +isn't currently the case. 
+ + +## Server configuration + +Support for this feature can be enabled and configured in the +`retention` section of the Synapse configuration file (see the +[sample file](https://github.com/matrix-org/synapse/blob/v1.7.3/docs/sample_config.yaml#L332-L393)). + +To enable support for message retentions policies, set the setting +`enabled` in this section to `true`. + + +### Default policy + +A default message retention policy is a policy defined in Synapse's +configuration that is used by Synapse for every room that doesn't have a +message retention policy configured in its state. This allows server +admins to ensure that messages are never kept indefinitely in a server's +database. + +A default policy can be defined as such, in the `retention` section of +the configuration file: + +```yaml + default_policy: + min_lifetime: 1d + max_lifetime: 1y +``` + +Here, `min_lifetime` and `max_lifetime` have the same meaning and level +of support as previously described. They can be expressed either as a +duration (using the units `s` (seconds), `m` (minutes), `h` (hours), +`d` (days), `w` (weeks) and `y` (years)) or as a number of milliseconds. + + +### Purge jobs + +Purge jobs are the jobs that Synapse run in the background to purge +expired events from the database. They are only run if support for +message retention policies is enabled in the server's configuration. If +no configuration for purge jobs is configured by the server admin, +Synapse will run one daily that will handle every room with a message +retention policy (or, if the server has a default policy configured, +every room it knows), which should be enough in most cases. + +Some server admins might want a finer control on when events are removed +depending on an event's room's policy. This can be done by setting the +`purge_jobs` sub-section in the `retention` section of the configuration +file. 
An example of such configuration could be: + +```yaml + purge_jobs: + - longest_max_lifetime: 3d + interval: 12h + - shortest_max_lifetime: 3d + longest_max_lifetime: 1w + interval: 1d + - shortest_max_lifetime: 1w + interval: 2d +``` + +In this example, we define three jobs: + +* one that runs twice a day (every 12 hours) and purges events in rooms + whose policy's `max_lifetime` is lower than or equal to 3 days. +* one that runs once a day and purges events in rooms whose policy's + `max_lifetime` is between 3 days and a week. +* one that runs once every 2 days and purges events in rooms whose + policy's `max_lifetime` is greater than a week. + +Note that this example is tailored to show different configurations and +features slightly more jobs than is probably necessary (in practice, a +server admin would probably consider it better to replace the last two +jobs with one that runs once a day and handles rooms whose +policy's `max_lifetime` is greater than 3 days). + +Keep in mind, when configuring these jobs, that a purge job can become +quite heavy on the server if it targets many rooms, so prefer +having jobs with a low interval that target a limited set of rooms. Also +make sure to include a job with no minimum and one with no maximum so +that your configuration handles every policy. + +As previously mentioned in this documentation, while a purge job that +runs e.g. every day means that an expired event might stay in the +database for up to a day after its expiry, Synapse hides expired events +from clients as soon as they expire, so the event is not visible to +local users between its expiry date and the moment it gets purged from +the server's database.
+ + +### Lifetime limits + +**Note: this feature is mainly useful within a closed federation or on +servers that don't federate, because there currently is no way to +enforce these limits in an open federation.** + +Server admins can restrict the values their local users are allowed to +use for both `min_lifetime` and `max_lifetime`. These limits can be +defined as such in the `retention` section of the configuration file: + +```yaml + allowed_lifetime_min: 1d + allowed_lifetime_max: 1y +``` + +Here, `allowed_lifetime_min` is the lowest value a local user can set +for both `min_lifetime` and `max_lifetime`, and `allowed_lifetime_max` +is the highest value. Both parameters are optional (e.g. setting +`allowed_lifetime_min` but not `allowed_lifetime_max` only enforces a +minimum and no maximum). + +Like other settings in this section, these parameters can be expressed +either as a duration or as a number of milliseconds. + + +## Note on reclaiming disk space + +While purge jobs actually delete data from the database, the disk space +used by the database might not decrease immediately on the database's +host. However, even though the database engine won't free up the disk +space, it will start writing new data into where the purged data was. + +If you want to reclaim the freed disk space anyway and return it to the +operating system, the server admin needs to run `VACUUM FULL;` on the +database (see the related +[PostgreSQL documentation](https://www.postgresql.org/docs/current/sql-vacuum.html)). 
+ From 51b8a21f0c3f52c26c63c196f5ed11b8be2394af Mon Sep 17 00:00:00 2001 From: Brendan Abolivier Date: Fri, 3 Jan 2020 13:49:12 +0100 Subject: [PATCH 2/7] Rename changelog --- changelog.d/{6623.doc => 6624.doc} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename changelog.d/{6623.doc => 6624.doc} (100%) diff --git a/changelog.d/6623.doc b/changelog.d/6624.doc similarity index 100% rename from changelog.d/6623.doc rename to changelog.d/6624.doc From b7dec300b7419402a0d5fc00e34684b95618a7d9 Mon Sep 17 00:00:00 2001 From: Brendan Abolivier Date: Fri, 3 Jan 2020 13:51:59 +0100 Subject: [PATCH 3/7] Fix vacuum instructions for sqlite --- docs/message_retention_policies.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/message_retention_policies.md b/docs/message_retention_policies.md index 78055b2f6..72f08fbb4 100644 --- a/docs/message_retention_policies.md +++ b/docs/message_retention_policies.md @@ -185,7 +185,7 @@ host. However, even though the database engine won't free up the disk space, it will start writing new data into where the purged data was. If you want to reclaim the freed disk space anyway and return it to the -operating system, the server admin needs to run `VACUUM FULL;` on the -database (see the related +operating system, the server admin needs to run `VACUUM FULL;` (or +`VACUUM;` for SQLite databases) on Synapse's database (see the related [PostgreSQL documentation](https://www.postgresql.org/docs/current/sql-vacuum.html)). 
From 03edfc58500197fee40c808680551ea55d1560e8 Mon Sep 17 00:00:00 2001 From: Brendan Abolivier Date: Tue, 7 Jan 2020 15:59:05 +0100 Subject: [PATCH 4/7] Update changelog.d/6624.doc Co-Authored-By: Richard van der Hoff <1389908+richvdh@users.noreply.github.com> --- changelog.d/6624.doc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/changelog.d/6624.doc b/changelog.d/6624.doc index c8aade097..bc9a022db 100644 --- a/changelog.d/6624.doc +++ b/changelog.d/6624.doc @@ -1 +1 @@ -Add a complete documentation of the message retention policies support. +Add complete documentation of the message retention policies support. From 01fbd9573626381c51700845956c9c9451cb645a Mon Sep 17 00:00:00 2001 From: Brendan Abolivier Date: Tue, 7 Jan 2020 15:59:38 +0100 Subject: [PATCH 5/7] Apply suggestions from code review Co-Authored-By: Richard van der Hoff <1389908+richvdh@users.noreply.github.com> --- docs/message_retention_policies.md | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/docs/message_retention_policies.md b/docs/message_retention_policies.md index 72f08fbb4..84f092523 100644 --- a/docs/message_retention_policies.md +++ b/docs/message_retention_policies.md @@ -9,13 +9,13 @@ be kept in a homeserver's database before being purged from it. A message retention policy is mainly defined by its `max_lifetime` parameter, which defines how long a message can be kept around after -it's been sent in the room. If a room doesn't have a message retention +it was sent to the room. If a room doesn't have a message retention policy, and there's no default one for a given server, then no message sent in that room is ever purged on that server. MSC1763 also specifies semantics for a `min_lifetime` parameter which defines the amount of time after which an event _can_ get purged (after -it's been sent to the room), but Synapse doesn't currently support it +it was sent to the room), but Synapse doesn't currently support it beyond registering it. 
Both `max_lifetime` and `min_lifetime` are optional parameters. @@ -70,7 +70,7 @@ Support for this feature can be enabled and configured in the `retention` section of the Synapse configuration file (see the [sample file](https://github.com/matrix-org/synapse/blob/v1.7.3/docs/sample_config.yaml#L332-L393)). -To enable support for message retentions policies, set the setting +To enable support for message retention policies, set the setting `enabled` in this section to `true`. @@ -99,7 +99,7 @@ duration (using the units `s` (seconds), `m` (minutes), `h` (hours), ### Purge jobs -Purge jobs are the jobs that Synapse run in the background to purge +Purge jobs are the jobs that Synapse runs in the background to purge expired events from the database. They are only run if support for message retention policies is enabled in the server's configuration. If no configuration for purge jobs is configured by the server admin, @@ -188,4 +188,3 @@ If you want to reclaim the freed disk space anyway and return it to the operating system, the server admin needs to run `VACUUM FULL;` (or `VACUUM;` for SQLite databases) on Synapse's database (see the related [PostgreSQL documentation](https://www.postgresql.org/docs/current/sql-vacuum.html)). 
- From 7ba98a2874c6c14c3f2ceb9b633a13d3e7345065 Mon Sep 17 00:00:00 2001 From: Brendan Abolivier Date: Tue, 7 Jan 2020 15:14:33 +0000 Subject: [PATCH 6/7] Incorporate review --- docs/message_retention_policies.md | 55 +++++++++++++++--------------- 1 file changed, 28 insertions(+), 27 deletions(-) diff --git a/docs/message_retention_policies.md b/docs/message_retention_policies.md index 84f092523..42b637516 100644 --- a/docs/message_retention_policies.md +++ b/docs/message_retention_policies.md @@ -6,6 +6,9 @@ follow the semantics described in [MSC1763](https://github.com/matrix-org/matrix-doc/blob/matthew/msc1763/proposals/1763-configurable-retention-periods.md), and allow server and room admins to configure how long messages should be kept in a homeserver's database before being purged from it. +**Please note that, as this feature isn't part of the Matrix +specification yet, this implementation is to be considered as +experimental.** A message retention policy is mainly defined by its `max_lifetime` parameter, which defines how long a message can be kept around after @@ -40,30 +43,6 @@ process and store that event until it's picked up by the next purge job, though it will always hide it from clients. -## Room configuration - -To configure a room's message retention policy, a room's admin or -moderator needs to send a state event in that room with the type -`m.room.retention` and the following content: - -```json -{ - "max_lifetime": ... -} -``` - -In this event's content, the `max_lifetime` parameter has the same -meaning as previously described, and needs to be expressed in -milliseconds. The event's content can also include a `min_lifetime` -parameter, which has the same meaning and limited support as previously -described. - -Note that over every server in the room, only the ones with support for -message retention policies will actually remove expired events. 
While -we plan to eventually enable this support by default in Synapse, this -isn't currently the case. - - ## Server configuration Support for this feature can be enabled and configured in the @@ -103,9 +82,8 @@ Purge jobs are the jobs that Synapse runs in the background to purge expired events from the database. They are only run if support for message retention policies is enabled in the server's configuration. If no configuration for purge jobs is configured by the server admin, -Synapse will run one daily that will handle every room with a message -retention policy (or, if the server has a default policy configured, -every room it knows), which should be enough in most cases. +Synapse will use a default configuration, which is described in the +[sample configuration file](https://github.com/matrix-org/synapse/blob/v1.7.3/docs/sample_config.yaml#L332-L393). Some server admins might want a finer control on when events are removed depending on an event's room's policy. This can be done by setting the @@ -177,6 +155,29 @@ Like other settings in this section, these parameters can be expressed either as a duration or as a number of milliseconds. +## Room configuration + +To configure a room's message retention policy, a room's admin or +moderator needs to send a state event in that room with the type +`m.room.retention` and the following content: + +```json +{ + "max_lifetime": ... +} +``` + +In this event's content, the `max_lifetime` parameter has the same +meaning as previously described, and needs to be expressed in +milliseconds. The event's content can also include a `min_lifetime` +parameter, which has the same meaning and limited support as previously +described. + +Note that over every server in the room, only the ones with support for +message retention policies will actually remove expired events. This +support is currently not enabled by default in Synapse. 
+ + ## Note on reclaiming disk space While purge jobs actually delete data from the database, the disk space From 3675fb9bc6b0f9fd068bee443c1499042359ee99 Mon Sep 17 00:00:00 2001 From: Brendan Abolivier Date: Tue, 7 Jan 2020 15:15:16 +0000 Subject: [PATCH 7/7] Fix reference --- docs/message_retention_policies.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/message_retention_policies.md b/docs/message_retention_policies.md index 42b637516..c4888c81b 100644 --- a/docs/message_retention_policies.md +++ b/docs/message_retention_policies.md @@ -83,7 +83,7 @@ expired events from the database. They are only run if support for message retention policies is enabled in the server's configuration. If no configuration for purge jobs is configured by the server admin, Synapse will use a default configuration, which is described in the -[sample configuration file](https://github.com/matrix-org/synapse/blob/v1.7.3/docs/sample_config.yaml#L332-L393). +[sample configuration file](https://github.com/matrix-org/synapse/blob/master/docs/sample_config.yaml#L332-L393). Some server admins might want a finer control on when events are removed depending on an event's room's policy. This can be done by setting the