* master:
fix exception from ReachedLimit.__repr__ when it has been instantiated implicitly and __init__ was not called
improve thread_raise() so that the new tests pass
more and better failing tests for thread_raise
failing test for forthcoming behavior of thread_raise
1. If the thread is not currently accepting exceptions, queue the exception and
raise it if and when the thread does start accepting them. This fixes the
problem of thread_raise exceptions being ignored when raised just before the
target thread starts accepting exceptions.
2. Avoid problems caused by raising multiple exceptions in the same
thread in quick succession by ensuring that only one is actually raised for
a given `with` block. This type of occurrence had been putting brozzler into
a borked/frozen state.
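
The two rules above can be sketched as plain bookkeeping. This is a minimal model under stated assumptions, not brozzler's actual code: the `Gate` class, its `inject` callback, and the method names are hypothetical, and the step that really interrupts the target thread is replaced by whatever callable is passed as `inject`.

```python
import contextlib
import threading


class Gate:
    """Hypothetical per-thread gate modelling the two rules (illustration only)."""

    def __init__(self, inject):
        self._lock = threading.Lock()
        self._inject = inject          # assumed callable that interrupts the thread
        self._accepting = False
        self._pending = None           # exception type queued while not accepting
        self._raised_in_block = False  # True once the current block got an exception

    def thread_raise(self, exc_type):
        with self._lock:
            if not self._accepting:
                # Rule 1: not accepting yet, so remember the exception for later.
                self._pending = exc_type
                return "queued"
            if self._raised_in_block:
                # Rule 2: this `with` block already received one exception.
                return "ignored"
            self._raised_in_block = True
        self._inject(exc_type)
        return "raised"

    @contextlib.contextmanager
    def accept_exceptions(self):
        with self._lock:
            pending, self._pending = self._pending, None
            self._accepting = True
            # A queued exception counts as the one exception for this block.
            self._raised_in_block = pending is not None
        try:
            if pending is not None:
                raise pending  # delivered as soon as the thread starts accepting
            yield
        finally:
            with self._lock:
                self._accepting = False
```

With a gate built as `Gate(inject=some_callable)`, a `thread_raise()` made before the `with gate.accept_exceptions():` block is raised on entry to the block, and a second `thread_raise()` made inside the same block returns "ignored" instead of interrupting the thread again.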
* master:
BrozzlerWorkerThread separate from MainThread to avoid SIGTERM/SIGINT raising an exception inside rethinkdb code or other sensitive code that BrozzlerWorker.run() calls
* master:
re-claim sites after 1 hour instead of 2 so that sites don't have to wait as long to be brozzled again in case of kill -9 brozzler-worker
add a github PR template for this repo
update headless chrome instructions for regular chrome builds
use the new api `with brozzler.thread_accept_exceptions()`
refactor thread_raise safety to use a context manager
allow this stupid test to fail
improve messaging when brozzler-stop-crawl is passed nonexistent seed/job id
safen up brozzler.thread_raise() to avoid interrupting rethinkdb transactions and such
* master:
implement resilience to warcprox outage, i.e. deal with brozzler.ProxyError in brozzler-worker
have _warcprox_write_record also raise ProxyError when appropriate, and test this
* master:
fix robots.txt proxy down test by setting site.id (cached robots is stored by site.id, and other tests that ran earlier with no site.id were interfering); and test another kind of connection error, for whatever that's worth
* master:
raise brozzler.ProxyError in case of proxy error fetching robots.txt, doing youtube-dl, or doing raw fetch
raise new exception brozzler.ProxyError in case of proxy error browsing a page
make brozzle-page respect --proxy (no test for this!)
oops, version bump for previous commit
bubble up proxy errors fetching robots.txt, with unit test, and documentation