Farside's rate limiting has been updated to return a 429 when a user
exceeds 1 request/sec. This should help eliminate much of the
scraping-type behavior that instance maintainers have been seeing from
Farside lately.
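As a rough illustration of the 1 request/sec rule, here is a minimal
Python sketch. It is not Farside's implementation; the class name, the
`user_key` argument and the injectable `clock` are all hypothetical.

```python
import time
from typing import Callable, Dict


class PerUserRateLimiter:
    """Allow at most one request per `interval` seconds for each user key."""

    def __init__(self, interval: float = 1.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.interval = interval
        self.clock = clock  # injectable so the limiter can be tested
        self.last_accepted: Dict[str, float] = {}

    def status_for(self, user_key: str) -> int:
        """Return the HTTP status a request from `user_key` would get."""
        now = self.clock()
        last = self.last_accepted.get(user_key)
        if last is not None and now - last < self.interval:
            return 429  # Too Many Requests
        # Only accepted requests reset the window.
        self.last_accepted[user_key] = now
        return 200
```

Rejected requests do not reset the window here, so a client that backs
off for one second is served again.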
Service changes:
- Teddit removed (not maintained)
- Bibliogram replaced by Proxigram
- Libreddit merged with redlib
Teddit is no longer maintained.
Libreddit has been forked to redlib, which seems to be actively trying
to work around the changes to Reddit's API.
Libreddit instances are now a mirror of redlib instances for the time
being.
Teddit's instance file contains null URL entries, which was breaking the
nightly build. This ensures that the URL exists for an entry before
continuing with processing.
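The guard described above could look roughly like this in Python. The
entry shape (a dict with a "url" key) is an assumption made for the
example, not the actual layout of Teddit's instance file.

```python
from typing import Any, Dict, List


def usable_urls(entries: List[Dict[str, Any]]) -> List[str]:
    """Collect URLs, skipping entries whose URL is missing, null or empty."""
    urls = []
    for entry in entries:
        url = entry.get("url")  # "url" is an assumed key name
        if not url:
            continue  # null/empty URL: skip instead of failing the build
        urls.append(url)
    return urls
```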
From the recent changes to twitter/X, it sounds like guest accounts are
now required for nitter, which are more easily rate limited. To avoid
any impact from Farside, the instances are now health checked in the
nightly build using https://status.d420.de (this doesn't seem to be
directly associated with the nitter maintainers, so might not be
entirely future-proof).
The ST website has been down for > 1 week, which seems to indicate that
it isn't coming back online anytime soon. Should reevaluate later to see
if it's back. Individual instances seem to be working fine.
searx.space includes metrics for instance uptime, which is now
implemented as part of farside's nightly build. Accordingly, the
instance availability task built into farside now excludes searxng
instances.
Closes #95
The cloudflare filter has been added back into the nightly build. Now
that the filtering method uses direct querying of the instance IP(s), it
should be more reliable than the namespace lookup (and more accurate).
services.json has been updated with the latest filtered results from
services-full.json as well.
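One way an IP-based Cloudflare check could work is sketched below in
Python. This may differ from what the Farside filter actually does. The
ranges are a small sample from Cloudflare's published IPv4 list, and the
function name is hypothetical; a real check would need the full, current
list.

```python
import ipaddress
from typing import Iterable

# Sample only: a few of Cloudflare's published IPv4 ranges.
CLOUDFLARE_SAMPLE_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("104.16.0.0/13", "172.64.0.0/13", "173.245.48.0/20")
]


def uses_cloudflare(ips: Iterable[str]) -> bool:
    """Return True if any resolved IP falls inside a sampled range."""
    return any(
        ipaddress.ip_address(ip) in network
        for ip in ips
        for network in CLOUDFLARE_SAMPLE_RANGES
    )
```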
The majority of searx instances returned by
https://searx.space/data/instances.json seem to be running non-release
versions of searx (i.e. versions like "2022.11.06-ae54c7d5" and not
"1.0.0"). Since the version alone doesn't indicate reliability, imo, I
don't think it's necessary to exclude instances based on this criterion
in the auto-update nightly build.
Also removes bibliogram from the auto-updater.
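For reference, the release vs. non-release distinction mentioned above
can be expressed as a simple pattern match. This Python sketch only
illustrates the kind of check being dropped; the function name and the
exact pattern are assumptions.

```python
import re

# Assumed definition of a "release" version: exactly three dotted
# numbers with no suffix, e.g. "1.0.0".
RELEASE_VERSION = re.compile(r"\d+\.\d+\.\d+")


def is_release_version(version: str) -> bool:
    """Return True for versions like "1.0.0", False when a suffix follows."""
    return RELEASE_VERSION.fullmatch(version) is not None
```

Note that a date-style version without a commit suffix would also match
this pattern, which is another reason the version string alone is a weak
signal.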
Bibliogram is discontinued, and many instances are going offline as a
result. This clears out the ones that have already been deactivated, but
the better solution would probably be to stop supporting bibliogram.
The cloudflare filter doesn't seem to work nearly as reliably when
performed as part of the GitHub Actions workflow as it does when run on
an actual machine.
The farside server will instead run the un-cloudflare script whenever it
pulls in new changes to services-full.json, which should be a much more
reliable approach to filtering out cloudflare instances.
If dig returns exit code 9 (no reply from server) when checking an
instance for cloudflare records, it shouldn't fail the CI build but
rather just skip adding the instance to the non-cloudflare services
list.
This should be re-evaluated soon to see if the CI build routinely has
issues with getting a server reply. If so, a different approach needs to
be taken to check if an instance is using cloudflare.
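A sketch of the exit code handling, written in Python rather than the
shell used in CI. The helper names and the choice of an NS lookup are
assumptions; only the meaning of dig's exit code 9 comes from the text
above.

```python
import subprocess
from typing import Callable, List, Optional

DIG_NO_REPLY = 9  # dig exit code: no reply from server


def run_dig(domain: str) -> subprocess.CompletedProcess:
    """Query the NS records for `domain` with dig (assumed lookup type)."""
    return subprocess.run(
        ["dig", "+short", "NS", domain],
        capture_output=True,
        text=True,
    )


def nameservers_or_none(
    domain: str,
    runner: Callable[[str], subprocess.CompletedProcess] = run_dig,
) -> Optional[List[str]]:
    """Return the nameservers, or None if the instance should be skipped."""
    result = runner(domain)
    if result.returncode == DIG_NO_REPLY:
        return None  # no reply: skip the instance, don't fail the build
    if result.returncode != 0:
        raise RuntimeError(f"dig failed with exit code {result.returncode}")
    return result.stdout.split()
```

A caller would leave the instance out of the non-cloudflare list
whenever `None` comes back.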
This updates the services json file to exclude all instances that are
detected to be using Cloudflare nameservers.
A separate "services-full.json" file will continue to be tracked in the
repo, which will include the full list of all instances for each
service and can be used with the `FARSIDE_SERVICES_JSON` environment
variable for anyone wanting to access the full instance list for each
service.
See #43
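The fallback between the two files can be pictured as follows. This is
a Python sketch of the idea only, not Farside's own code; the function
names and the default path are assumptions.

```python
import json
import os
from typing import Any, Mapping


def services_path(env: Mapping[str, str] = os.environ,
                  default: str = "services.json") -> str:
    """Pick the services file: the env override if set, else the default."""
    return env.get("FARSIDE_SERVICES_JSON", default)


def load_services(env: Mapping[str, str] = os.environ) -> Any:
    """Load and parse whichever services file was selected."""
    with open(services_path(env), encoding="utf-8") as handle:
        return json.load(handle)
```

With `FARSIDE_SERVICES_JSON=services-full.json` set, the full list is
used; without it, the filtered list is.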
Wikiless updated their instance json with a couple of changes that broke
Farside's auto update workflow:
- The protocol for each instance is now included by default (no need to prepend
"https://")
- The instances are differentiated between regular, onion, and i2p (no need to
check for ".onion" in regular instance URLs)
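A tolerant way to handle both the old and new formats is sketched below
in Python. The entry shape (a dict with an optional "url" key for
regular instances) is an assumption for illustration and may not match
the Wikiless instance json exactly.

```python
from typing import Any, Dict, List


def normalize_url(raw: str) -> str:
    """Prepend "https://" only when no protocol is present."""
    if raw.startswith(("http://", "https://")):
        return raw
    return "https://" + raw


def clearnet_urls(instances: List[Dict[str, Any]]) -> List[str]:
    """Collect regular instance URLs, ignoring onion/i2p-only entries."""
    urls = []
    for instance in instances:
        url = instance.get("url")  # "url" is an assumed key name
        if url:
            urls.append(normalize_url(url))
    return urls
```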
Added a new separate service for only redirecting to SearXNG instances.
Note that plain "searx" redirects will use both SearX and SearXNG
instances for those who don't have a preference between the two.
Closes #23
- teddit (https://teddit.net/about)
A free and open source alternative Reddit front-end focused on
privacy. Inspired by the Nitter project.
- Piped (https://github.com/TeamPiped/Piped)
An alternative privacy-friendly YouTube frontend which is efficient
by design.
- SimplyTranslate (https://simple-web.org/projects/simplytranslate.html)
We aim to provide fast and private translations to the user without
wasting much overhead for extensive styling or JavaScript
___
Also adds SimplyTranslate to the github pipeline, since they provide a
list of the service's public instances.
Closes #4
The steps taken to commit the changes to services.json were overly complicated.
This simplifies the steps to just `exit 0` if there are no changes to commit.
* Create nightly update workflow for instances
A nightly GitHub Actions CI workflow has been added to fetch new
instances of supported services within Farside.
Currently only Searx is supported, but obviously others could be added
if there are similarly easy ways to fetch and filter instances
programmatically.
services.json has also been updated with the initial results of the
workflow script.
* Set headers for every HTTPoison request
This serves as a workaround for bot blocking via filtron.
* Expand filtering of searx instances
New filter enforces:
- No Cloudflare
- Good TLS config
- Good HTTP header config
- Vanilla instances or forks
- Instances with 100% search success
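
The filter criteria above can be read as a single predicate. The Python
sketch below uses a simplified, hypothetical record; the field names and
the grades treated as "good" are assumptions, not the searx.space
schema or the workflow's actual thresholds.

```python
from dataclasses import dataclass
from typing import List

GOOD_GRADES = {"A+", "A"}  # assumption: what counts as a "good" grade


@dataclass
class Instance:
    url: str
    uses_cloudflare: bool
    tls_grade: str
    http_grade: str
    variant: str  # assumed values: "vanilla", "fork", or anything else
    search_success_pct: float


def passes_filter(instance: Instance) -> bool:
    """Apply all five criteria; every one must hold."""
    return (
        not instance.uses_cloudflare
        and instance.tls_grade in GOOD_GRADES
        and instance.http_grade in GOOD_GRADES
        and instance.variant in {"vanilla", "fork"}
        and instance.search_success_pct == 100
    )


def filter_instances(instances: List[Instance]) -> List[str]:
    """Return the URLs of instances that pass the filter."""
    return [i.url for i in instances if passes_filter(i)]
```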