mirror of
https://git.anonymousland.org/anonymousland/synapse.git
synced 2025-05-02 09:46:07 -04:00
Use attrs internally for the URL preview code & add documentation. (#10753)
This commit is contained in:
parent
a23f3abb9b
commit
89ba834818
5 changed files with 132 additions and 119 deletions
51
docs/development/url_previews.md
Normal file
51
docs/development/url_previews.md
Normal file
|
@ -0,0 +1,51 @@
|
|||
URL Previews
|
||||
============
|
||||
|
||||
The `GET /_matrix/media/r0/preview_url` endpoint provides a generic preview API
|
||||
for URLs which outputs [Open Graph](https://ogp.me/) responses (with some Matrix
|
||||
specific additions).
|
||||
|
||||
This does have trade-offs compared to other designs:
|
||||
|
||||
* Pros:
|
||||
* Simple and flexible; can be used by any clients at any point
|
||||
* Cons:
|
||||
* If each homeserver provides one of these independently, all the HSes in a
|
||||
room may needlessly DoS the target URI
|
||||
* The URL metadata must be stored somewhere, rather than just using Matrix
|
||||
itself to store the media.
|
||||
* Matrix cannot be used to distribute the metadata between homeservers.
|
||||
|
||||
When Synapse is asked to preview a URL it does the following:
|
||||
|
||||
1. Checks against a URL blacklist (defined as `url_preview_url_blacklist` in the
|
||||
config).
|
||||
2. Checks the in-memory cache by URLs and returns the result if it exists. (This
|
||||
is also used to de-duplicate processing of multiple in-flight requests at once.)
|
||||
3. Kicks off a background process to generate a preview:
|
||||
1. Checks the database cache by URL and timestamp and returns the result if it
|
||||
has not expired and was successful (a 2xx return code).
|
||||
2. Checks if the URL matches an oEmbed pattern. If it does, fetch the oEmbed
|
||||
response. If this is an image, replace the URL to fetch and continue. If
|
||||
if it is HTML content, use the HTML as the document and continue.
|
||||
3. If it doesn't match an oEmbed pattern, downloads the URL and stores it
|
||||
into a file via the media storage provider and saves the local media
|
||||
metadata.
|
||||
5. If the media is an image:
|
||||
1. Generates thumbnails.
|
||||
2. Generates an Open Graph response based on image properties.
|
||||
6. If the media is HTML:
|
||||
1. Decodes the HTML via the stored file.
|
||||
2. Generates an Open Graph response from the HTML.
|
||||
3. If an image exists in the Open Graph response:
|
||||
1. Downloads the URL and stores it into a file via the media storage
|
||||
provider and saves the local media metadata.
|
||||
2. Generates thumbnails.
|
||||
3. Updates the Open Graph response based on image properties.
|
||||
7. Stores the result in the database cache.
|
||||
4. Returns the result.
|
||||
|
||||
The in-memory cache expires after 1 hour.
|
||||
|
||||
Expired entries in the database cache (and their associated media files) are
|
||||
deleted every 10 seconds. The default expiration time is 1 hour from download.
|
Loading…
Add table
Add a link
Reference in a new issue