Noah Levitt
6edfbddd64
Merge branch 'master' into qa
* master:
oops bump version
2017-06-07 13:08:31 -07:00
Noah Levitt
02e1c88fac
oops bump version
2017-06-07 13:08:23 -07:00
Noah Levitt
da6e07fb61
Merge branch 'master' into qa
* master:
use %r instead of calling repr()
2017-06-07 13:07:51 -07:00
Noah Levitt
4d7f4518b5
use %r instead of calling repr()
2017-06-07 13:07:42 -07:00
Noah Levitt
99c8ebcc8b
Merge branch 'master' into qa
* master:
oops, should have bumped version number after merging pull requests
add relocated behavior file with updated copyright
enable huffpostslides.js
2017-06-07 08:52:12 -07:00
Noah Levitt
65adc11d95
oops, should have bumped version number after merging pull requests
2017-06-07 08:51:21 -07:00
Noah Levitt
39fb811d13
Merge pull request #41 from galgeek/ARI-4868
ARI-4868 behavior for Huffington Post slideshow
2017-06-02 14:41:02 -07:00
Noah Levitt
5e38a9755e
Merge pull request #42 from galgeek/loginAndReloadSeed
login and reload original url if navigated away
2017-06-02 14:03:51 -07:00
Barbara Miller
d41f30cbc7
Merge branch 'loginAndReloadSeed' into qa
2017-06-02 13:40:36 -07:00
Barbara Miller
a0330d9716
updates per Noah's review
2017-06-02 13:27:01 -07:00
Barbara Miller
830b0eef89
undo post-login nav (ARI-5385 and/or ARI-5386)
2017-06-02 12:47:19 -07:00
Noah Levitt
f2227e6759
have travis-ci test against python 3.5 and 3.6 too
2017-05-26 13:28:00 -07:00
Noah Levitt
bdc0badec3
rewrite frontier.scope_and_schedule_outlinks() to use batch rethinkdb queries, because we have witnessed the method running for hours(!)
2017-05-26 13:24:14 -07:00
Noah Levitt
0a19770ba7
Merge branch 'master' into qa
* master:
remove stray logging
2017-05-24 11:36:15 -07:00
Noah Levitt
d904daea9c
remove stray logging
2017-05-24 11:36:06 -07:00
Noah Levitt
f0b9020c0a
Merge branch 'master' into qa
* master:
use "ttl" for updated doublethink svc reg api
2017-05-23 11:35:19 -07:00
Noah Levitt
ac543ee5b6
use "ttl" for updated doublethink svc reg api
2017-05-23 11:33:04 -07:00
Barbara Miller
079db762d4
add relocated behavior file with updated copyright
2017-05-22 12:38:50 -07:00
Barbara Miller
d7c31be8d0
enable huffpostslides.js
2017-05-22 12:32:28 -07:00
Noah Levitt
9c8f626c38
Merge branch 'master' into qa
* master:
fix exception from ReachedLimit.__repr__ when it has been instantiated implicitly and __init__ was not called
improve thread_raise() so that the new tests pass
even more, better failing tests for thread_raise
failing test for forthcoming behavior of thread_raise
2017-05-16 15:47:27 -07:00
Noah Levitt
89e7c8b079
fix exception from ReachedLimit.__repr__ when it has been instantiated implicitly and __init__ was not called
2017-05-16 15:47:18 -07:00
Noah Levitt
31dc6a2d97
improve thread_raise() so that the new tests pass
1. If thread is not currently accepting exceptions, queue it and raise if and
when it does start accepting them. This fixes problem of thread_raise
exceptions being ignored when raised just before the target thread starts
accepting exceptions.
2. Avoid problems caused by raising multiple exceptions in the same
thread in quick succession by ensuring that only one is actually raised for
a given `with` block. This type of occurrence had been putting brozzler into
a borked/frozen state.
2017-05-16 14:20:53 -07:00
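The two behaviors described in commit 31dc6a2d97 above (queue an exception until the target thread accepts it, and raise at most one per `with` block) can be illustrated with a rough sketch. This is not brozzler's actual code: the `ThreadExceptionGate` class, the `_gate_for` helper, and the signatures given here to `thread_raise` and `thread_accept_exceptions` are assumptions made for illustration. The sketch relies on CPython's `PyThreadState_SetAsyncExc` and does not attempt to close every race around leaving the `with` block.

```python
import ctypes
import threading


class ThreadExceptionGate:
    """Hypothetical gate deciding when an exception may be raised in a thread."""

    def __init__(self, thread):
        self.thread = thread
        self.lock = threading.Lock()
        self.accepting = False  # True only inside the `with` block
        self.pending = None     # exception type queued or already injected

    def __enter__(self):
        with self.lock:
            if self.pending is not None:
                # Queued before this thread was accepting: raise it now.
                exc_type, self.pending = self.pending, None
                raise exc_type
            self.accepting = True
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        with self.lock:
            self.accepting = False
            self.pending = None  # the next `with` block starts clean
        return False  # never swallow the exception

    def queue_exception(self, exc_type):
        with self.lock:
            if self.pending is not None:
                return False  # one exception per `with` block; drop extras
            self.pending = exc_type
            if self.accepting:
                # CPython-only: inject asynchronously into the target thread.
                ctypes.pythonapi.PyThreadState_SetAsyncExc(
                    ctypes.c_ulong(self.thread.ident),
                    ctypes.py_object(exc_type))
            return True


_gates = {}
_gates_lock = threading.Lock()


def _gate_for(thread):
    # One gate per thread, keyed by ident (the thread must have been started).
    with _gates_lock:
        if thread.ident not in _gates:
            _gates[thread.ident] = ThreadExceptionGate(thread)
        return _gates[thread.ident]


def thread_accept_exceptions():
    """Use as `with thread_accept_exceptions():` in the target thread."""
    return _gate_for(threading.current_thread())


def thread_raise(thread, exc_type):
    """Raise exc_type in `thread` now, or as soon as it starts accepting."""
    return _gate_for(thread).queue_exception(exc_type)
```

In this sketch a worker wraps its interruptible section in `with thread_accept_exceptions():`, and another thread calls `thread_raise(worker, SomeException)`. A call made before the worker enters the block is held and raised on entry; a second call made while one is still pending returns `False` and is dropped.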
Noah Levitt
d514eaec15
even more, better failing tests for thread_raise
2017-05-16 14:00:10 -07:00
Noah Levitt
d2525e2e87
failing test for forthcoming behavior of thread_raise
2017-05-15 16:20:20 -07:00
Noah Levitt
e5371ef0b0
Merge branch 'master' into qa
* master:
recognize ConnectionError (of which ConnectionResetError is a subclass) in _warcprox_write_record as a proxy error
2017-05-12 10:04:06 -07:00
Noah Levitt
60c5a7c1c4
recognize ConnectionError (of which ConnectionResetError is a subclass) in _warcprox_write_record as a proxy error
2017-05-12 10:03:53 -07:00
Barbara Miller
b2a4fbb17f
Merge branch 'ARI-4868' into qa
2017-05-10 15:09:00 -07:00
Barbara Miller
35977f6276
enable huffpostslides.js
2017-05-10 15:08:29 -07:00
Barbara Miller
054625b8a5
Merge pull request #40 from BitBaron/ari-4960
Crawl Google Calendar for fortstjames.ca
2017-05-09 14:12:48 -07:00
Noah Levitt
0c45ca2211
Merge branch 'master' into qa
* master:
do a better job of making sure to shut down the browser when brozzle-page is killed
2017-05-03 16:43:38 -07:00
Noah Levitt
b4bf17df9b
do a better job of making sure to shut down the browser when brozzle-page is killed
2017-05-03 16:43:31 -07:00
Noah Levitt
c3637ecb35
Merge branch 'master' into qa
* master:
handle another rethinkdb outage corner case
2017-05-01 14:12:51 -07:00
Noah Levitt
9d4cbbf6eb
handle another rethinkdb outage corner case
2017-05-01 14:12:43 -07:00
Noah Levitt
15a3da61c6
Merge branch 'master' into qa
* master:
BrozzlerWorkerThread separate from MainThread to avoid SIGTERM/SIGINT raising exception inside of some rethinkdb code or other sensitive code that BrozzlerWorker.run() calls
2017-05-01 13:46:28 -07:00
Noah Levitt
389db01458
BrozzlerWorkerThread separate from MainThread to avoid SIGTERM/SIGINT raising exception inside of some rethinkdb code or other sensitive code that BrozzlerWorker.run() calls
2017-05-01 13:46:19 -07:00
Noah Levitt
69d8571871
Merge branch 'master' into qa
* master:
re-claim sites after 1 hour instead of 2 so that sites don't have to wait as long to be brozzled again in case of kill -9 brozzler-worker
add a github PR template for this repo
update headless chrome instructions for regular chrome builds
use the new api `with brozzler.thread_accept_exceptions()`
refactor thread_raise safety to use a context manager
allow this stupid test to fail
improve messaging when brozzler-stop-crawl is passed nonexistent seed/job id
safen up brozzler.thread_raise() to avoid interrupting rethinkdb transactions and such
2017-05-01 13:00:34 -07:00
Noah Levitt
52433ade78
re-claim sites after 1 hour instead of 2 so that sites don't have to wait as long to be brozzled again in case of kill -9 brozzler-worker
2017-05-01 13:00:04 -07:00
Noah Levitt
000d40c4dc
Merge pull request #39 from bnewbold/bnewbold-pr-template
add a github PR template for this repo
2017-04-26 14:34:32 -07:00
bnewbold
83552eb444
add a github PR template for this repo
2017-04-26 14:10:24 -07:00
Noah Levitt
d972919db0
Merge pull request #36 from nlevitt/safe-thread-raise
safen up brozzler.thread_raise() to avoid interrupting rethinkdb tran…
2017-04-26 11:15:02 -07:00
Noah Levitt
27ee8d53f8
Merge pull request #38 from ato/headless-doc
update headless chrome instructions for regular chrome builds
2017-04-25 09:39:43 -07:00
Alex Osborne
69aba8b762
update headless chrome instructions for regular chrome builds
Also make it clearer that this hasn't been tested much.
2017-04-25 15:00:25 +10:00
Noah Levitt
dcf4811470
Merge branch 'master' into safe-thread-raise
2017-04-24 20:06:37 -07:00
Noah Levitt
d916b68ab9
use the new api `with brozzler.thread_accept_exceptions()`
2017-04-24 20:02:34 -07:00
Noah Levitt
0953e6972e
refactor thread_raise safety to use a context manager
2017-04-24 19:51:51 -07:00
Noah Levitt
f140e5bdbd
allow this stupid test to fail
2017-04-21 12:17:11 -07:00
Noah Levitt
ba519d7288
improve messaging when brozzler-stop-crawl is passed nonexistent seed/job id
2017-04-20 18:04:17 -07:00
Noah Levitt
7706bab8b8
safen up brozzler.thread_raise() to avoid interrupting rethinkdb transactions and such
2017-04-20 17:08:16 -07:00
Noah Levitt
4f5553954c
Merge branch 'master' into qa
* master:
quote that shell meta character
need warcprox in python path for travis tests now
2017-04-19 08:58:47 -07:00
Noah Levitt
b3fa7a4e39
quote that shell meta character
2017-04-18 18:46:59 -07:00