anonymousland-synapse

mirror of https://git.anonymousland.org/anonymousland/synapse.git synced 2024-10-01 11:49:51 -04:00

Author	SHA1	Message	Date
Denis Kasak	337f38cac3	Implement a content type allow list for URL previews (#11936 ) This implements an allow list for content types for which Synapse will attempt URL preview. If a URL resolves to a resource with a content type which isn't in the list, the download will terminate immediately. This makes sense given that Synapse would never successfully generate a URL preview for such files in the first place, and helps prevent issues with streaming media servers, such as #8302. Signed-off-by: Denis Kasak <dkasak@termina.org.uk>	2022-02-10 15:43:01 +00:00
Patrick Cloke	807efd26ae	Support rendering previews with data: URLs in them (#11767 ) Images which are data URLs will no longer break URL previews and will properly be "downloaded" and thumbnailed.	2022-01-24 08:58:18 -05:00
Philippe Daouadi	15ffc4143c	Fix preview of imgur and Tenor URLs. (#11669 ) By scraping Open Graph information from the HTML even when an autodiscovery endpoint is found. The results are then combined to capture as much information as possible from the page.	2022-01-18 13:20:24 -05:00
Patrick Cloke	eb39da6782	Move HTML parsing to a separate file for URL previews. (#11566 ) * Splits the logic for parsing HTML from the resource handling code. * Fix a circular import in the oEmbed code (which uses the HTML parsing code). * Renames some of the HTML parsing methods to: * Make it clear which methods are "internal" to the module. * Clarify what the methods do.	2021-12-13 17:55:07 +00:00
Patrick Cloke	9b90b9454b	Add type hints to media repository storage module (#11311 )	2021-11-12 11:05:26 -05:00
Patrick Cloke	b3e843be88	Fix URL preview errors when previewing XML documents. (#11196 )	2021-10-27 14:48:02 +00:00
Patrick Cloke	efd0074ab7	Ensure each charset is attempted only once during media preview. (#11089 ) There's no point in trying more than once since it is guaranteed to continually fail.	2021-10-14 18:51:44 +00:00
Patrick Cloke	e2f0b49b3f	Attempt different character encodings when previewing a URL. (#11077 ) This follows similar logic to BeautifulSoup where we attempt different character encodings until we find one which works.	2021-10-14 10:17:20 -04:00
Patrick Cloke	732bbf6737	Be more lenient when parsing the version for oEmbed responses. (#11065 )	2021-10-13 07:00:07 -04:00
Patrick Cloke	1b112840d2	Autodiscover oEmbed endpoint from returned HTML (#10822 ) Searches the returned HTML for an oEmbed endpoint using the autodiscovery mechanism (`<link rel=...>`), and will request it to generate the preview.	2021-10-08 14:14:42 -04:00
Sean Quah	2be0fde3d6	Fix empty `url_cache_thumbnails/yyyy-mm-dd/` directories being left behind (#10924 )	2021-09-29 10:24:37 +01:00
Sean Quah	f7768f62cb	Avoid storing URL cache files in storage providers (#10911 ) URL cache files are short-lived and it does not make sense to offload them (eg. to the cloud) or back them up.	2021-09-27 12:55:27 +01:00
Patrick Cloke	bb7fdd821b	Use direct references for configuration variables (part 5). (#10897 )	2021-09-24 07:25:21 -04:00
Erik Johnston	50022cff96	Add reactor to `SynapseRequest` and fix up types. (#10868 )	2021-09-24 11:01:25 +01:00
Patrick Cloke	6fc8be9a1b	Include more information in oEmbed previews. (#10819 ) * Improved titles (fall back to the author name if there's no title) and include the site name. * Handle photo/video payloads. * Include the original URL in the Open Graph response. * Fix the expiration time (by properly converting from seconds to milliseconds).	2021-09-22 09:45:20 -04:00
Patrick Cloke	ba7a91aea5	Refactor oEmbed previews (#10814 ) The major change is moving the decision of whether to use oEmbed further up the call-stack. This reverts the _download_url method to being a "dumb" function which takes a single URL and downloads it (as it was before #7920). This also makes more minor refactorings: * Renames internal variables for clarity. * Factors out shared code between the HTML and rich oEmbed previews. * Fixes tests to preview an oEmbed image.	2021-09-21 16:09:57 +00:00
Patrick Cloke	b93259082c	Add missing type hints to non-client REST servlets. (#10817 ) Including admin, consent, key, synapse, and media. All REST servlets (the synapse.rest module) now require typed method definitions.	2021-09-15 08:45:32 -04:00
Patrick Cloke	89ba834818	Use attrs internally for the URL preview code & add documentation. (#10753 )	2021-09-07 13:10:34 +00:00
Patrick Cloke	e2481dbe93	Allow configuration of the oEmbed URLs. (#10714 ) This adds configuration options (under an `oembed` section) to configure which URLs are matched to use oEmbed for URL previews.	2021-08-31 18:37:07 -04:00
sri-vidyut	8e1febc6a1	Support underscores (in addition to hyphens) for charset detection. (#10410 )	2021-07-27 17:29:42 +00:00
Patrick Cloke	5db118626b	Add a return type to parse_string. (#10438 ) And set the required attribute in a few places which will error if a parameter is not provided.	2021-07-21 09:47:56 -04:00
Jonathan de Jong	98aec1cc9d	Use inline type hints in `handlers/` and `rest/`. (#10382 )	2021-07-16 18:22:36 +01:00
Jonathan de Jong	4b965c862d	Remove redundant "coding: utf-8" lines (#9786 ) Part of #9744 Removes all redundant `# -*- coding: utf-8 -*-` lines from files, as Python 3 automatically reads source code as utf-8 now. `Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`	2021-04-14 15:34:27 +01:00
Patrick Cloke	44bb881096	Add type hints to expiring cache. (#9730 )	2021-04-06 08:58:18 -04:00
Erik Johnston	b5efcb577e	Make it possible to use dmypy (#9692 ) Running `dmypy run` will do a `mypy` check while spinning up a daemon that makes rerunning `dmypy run` a lot faster. `dmypy` doesn't support `follow_imports = silent` and has `local_partial_types` enabled, so this PR enables those options and fixes the issues that were newly raised. Note that `local_partial_types` will be enabled by default in upcoming mypy releases.	2021-03-26 16:49:46 +00:00
Patrick Cloke	b7748d3c00	Import HomeServer from the proper module. (#9665 )	2021-03-23 07:12:48 -04:00
Patrick Cloke	55da8df078	Fix additional type hints from Twisted 21.2.0. (#9591 )	2021-03-12 11:37:57 -05:00
Patrick Cloke	a0bc9d387e	Use the proper Request in type hints. (#9515 ) This also pins the Twisted version in the mypy job for CI until proper type hints are fixed throughout Synapse.	2021-03-01 12:23:46 -05:00
Tim Leung	ddb240293a	Add support for no_proxy and case insensitive env variables (#9372 ) Changes proposed in this PR: - Add support for the `no_proxy` and `NO_PROXY` environment variables - Internally rely on urllib's `proxy_bypass_environment` (`bdb941be42/Lib/urllib/request.py`, L2519) - Extract env variables using urllib's `getproxies`/`getproxies_environment` (`bdb941be42/Lib/urllib/request.py`, L2488), which supports lowercase + uppercase, preferring lowercase, except for `HTTP_PROXY` in a CGI environment This does contain behaviour changes for consumers, so making sure these are called out: - `no_proxy`/`NO_PROXY` is now respected - lowercase `https_proxy` is now allowed and taken over `HTTPS_PROXY` Related to #9306 which also uses `ProxyAgent` Signed-off-by: Timothy Leung <tim95@hotmail.co.uk>	2021-02-26 17:37:57 +00:00
Eric Eastwood	0a00b7ff14	Update black, and run auto formatting over the codebase (#9381 ) - Update black version to the latest - Run black auto formatting over the codebase - Run autoformatting according to `docs/code_style.md` (`80d6dc9783/docs/code_style.md`) - Update `code_style.md` docs around installing black to use the correct version	2021-02-16 22:32:34 +00:00
Patrick Cloke	0963d39ea6	Handle additional errors when previewing URLs. (#9333 ) * Handle the case of lxml not finding a document tree. * Parse the document encoding from the XML tag.	2021-02-08 12:33:30 -05:00
Patrick Cloke	4937fe3d6b	Try to recover from unknown encodings when previewing media. (#9164 ) Treat unknown encodings (according to lxml) as UTF-8 when generating a preview for HTML documents. This isn't fully accurate, but will hopefully give a reasonable title and summary.	2021-01-26 07:32:17 -05:00
Patrick Cloke	d34c6e1279	Add type hints to media rest resources. (#9093 )	2021-01-15 10:57:37 -05:00
Patrick Cloke	1f3748f033	Do not raise a 500 exception when previewing empty media. (#8883 )	2020-12-07 10:00:08 -05:00
Richard van der Hoff	11c9e17738	Add type annotations to SimpleHttpClient (#8372 )	2020-09-24 15:47:20 +01:00
Patrick Cloke	aec294ee0d	Use slots in attrs classes where possible (#8296 ) slots use less memory (and attribute access is faster) while slightly limiting the flexibility of the class attributes. This focuses on objects which are instantiated "often" and for short periods of time.	2020-09-14 12:50:06 -04:00
Patrick Cloke	4e874ed593	Remove unnecessary maybeDeferred calls (#8044 )	2020-08-07 09:44:48 -04:00
David Vo	4dd27e6d11	Reduce unnecessary whitespace in JSON. (#7372 )	2020-08-07 08:02:55 -04:00
Erik Johnston	a7bdf98d01	Rename database classes to make some sense (#8033 )	2020-08-05 21:38:57 +01:00
Patrick Cloke	68626ff8e9	Convert the remaining media repo code to async / await. (#7947 )	2020-07-27 14:40:11 -04:00
Patrick Cloke	3fc8fdd150	Support oEmbed for media previews. (#7920 ) Fixes previews of Twitter URLs by using their oEmbed endpoint to grab content.	2020-07-27 07:50:44 -04:00
Erik Johnston	5cdca53aa0	Merge different Resource implementation classes (#7732 )	2020-07-03 19:02:19 +01:00
Erik Johnston	b44bdd7f7b	Support running multiple media repos. (#7706 ) This requires a new config option to specify which media repo should be responsible for running background jobs to e.g. clear out expired URL preview caches.	2020-06-17 14:13:30 +01:00
Dagfinn Ilmari Mannsåker	a3f11567d9	Replace all remaining six usage with native Python 3 equivalents (#7704 )	2020-06-16 08:51:47 -04:00
Michael Kaye	5308239d5d	Reduce logging verbosity of URL cache cleanup. (#7295 )	2020-04-22 07:45:16 -04:00
Andrew Morgan	a48138784e	Allow specifying the value of Accept-Language header for URL previews (#7265 )	2020-04-15 13:35:29 +01:00
Patrick Cloke	caec7d4fa0	Convert some of the media REST code to async/await (#7110 )	2020-03-20 07:20:02 -04:00
Erik Johnston	b0a66ab83c	Fixup synapse.rest to pass mypy (#6732 )	2020-01-20 17:38:21 +00:00
Erik Johnston	4a33a6dd19	Move background update handling out of store	2019-12-05 11:11:26 +00:00
Richard van der Hoff	ef1a85e773	Fix startup error when http proxy is defined. (#6421 ) Guess I only tested this on python 2 :/ Fixes #6419.	2019-11-26 18:10:50 +00:00

168 Commits