This implements an allow list for content types for which Synapse will attempt URL preview. If a URL resolves to a resource with a content type which isn't in the list, the download will terminate immediately.
This makes sense given that Synapse would never successfully generate a URL preview for such files in the first place, and helps prevent issues with streaming media servers, such as #8302.
Signed-off-by: Denis Kasak dkasak@termina.org.uk
* Splits the logic for parsing HTML from the resource handling code.
* Fix a circular import in the oEmbed code (which uses the HTML parsing code).
* Renames some of the HTML parsing methods to:
* Make it clear which methods are "internal" to the module.
* Clarify what the methods do.
* Improved titles (fall back to the author name if there's not title) and include the site name.
* Handle photo/video payloads.
* Include the original URL in the Open Graph response.
* Fix the expiration time (by properly converting from seconds to milliseconds).
The major change is moving the decision of whether to use oEmbed
further up the call-stack. This reverts the _download_url method to
being a "dumb" functionwhich takes a single URL and downloads it
(as it was before #7920).
This also makes more minor refactorings:
* Renames internal variables for clarity.
* Factors out shared code between the HTML and rich oEmbed
previews.
* Fixes tests to preview an oEmbed image.
This adds the format to the request arguments / URL to
ensure that JSON data is returned (which is all that
Synapse supports).
This also adds additional error checking / filtering to the
configuration file to ignore XML-only providers.
Part of #9744
Removes all redundant `# -*- coding: utf-8 -*-` lines from files, as python 3 automatically reads source code as utf-8 now.
`Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`
remove the stubbing out of `request.process`, so that `requestReceived` also renders the request via the appropriate resource.
Replace render() with a stub for now.
Prevents a SynapseError being raised inside of a IResolutionReceiver and instead opts to just return 0 results. This thus means that we have to lump a failed lookup and a blacklisted lookup together with the same error message, but the substitute should be generic enough to cover both cases.