28 Commits

Author SHA1 Message Date
Noah Levitt
2398031010 let the OS pick an available port, to avoid what appear to be timing issues causing multiple browsers to choose the same port 2017-02-22 12:44:19 -08:00
Noah Levitt
b409e49cfa deprecate current scope rule syntax and create new syntax with slightly different semantics (to be documented), and add parent_url_regex scope rule; unit test for scoping 2017-02-15 16:46:45 -08:00
Noah Levitt
14e312e4c4 make sure site is not "claimed" when it's finished 2017-02-03 16:40:15 -08:00
Noah Levitt
a60878c5a7 support for resuming jobs, keeping track of each start and stop time, used to enforce time limits correctly 2017-02-03 14:56:12 -08:00
Noah Levitt
5c684779e5 pywb support for thumbnail: and screenshot: urls 2017-01-31 10:26:38 -08:00
Noah Levitt
4b6831b464 new flag Page.blocked_by_robots 2017-01-30 10:43:25 -08:00
Noah Levitt
5375b819dd missed a spot 2017-01-20 23:59:31 -08:00
Noah Levitt
011d814ee2 tests for dismissal of javascript dialogs (alert, prompt, confirm) 2017-01-13 11:46:42 -08:00
Noah Levitt
70b67942a5 restore handling of 420 Reached limit, with a rudimentary test 2016-12-22 13:44:09 -08:00
Noah Levitt
e5fb6cb4b9 add import missing from test 2016-12-21 19:19:34 -08:00
Noah Levitt
eabb0fb114 restore support for on_response and on_request, with an automated test for on_response 2016-12-21 18:35:55 -08:00
Noah Levitt
f7427219cf restore handling of "aw snap" or "he's dead jim" 2016-12-21 14:21:20 -08:00
Noah Levitt
86d6060a2d loosen the find_available_port test slightly, since it seems to be not 100% predictable for reasons i haven't investigated 2016-12-20 17:52:21 -08:00
Noah Levitt
b24b229cb2 how did i miss this file? 2016-12-20 11:13:48 -08:00
Noah Levitt
7a40822e64 forgot to git add new test data 2016-12-19 18:10:07 -08:00
Noah Levitt
86ac48d6c3 generalized support for login doing automatic detection of login form on a page 2016-12-19 17:30:09 -08:00
Noah Levitt
9bcec54f4b fix _find_available_port and its unit test 2016-12-07 14:08:34 -08:00
Noah Levitt
eed8b9ec30 little fixes 2016-12-07 11:20:10 -08:00
Noah Levitt
ce03381b92 move _find_available_ports to chrome.py, changing the way it works so that browser:9200 doesn't get stuck at 9201 forever, which pushes 9201 to 9202 etc, and add a unit test 2016-12-06 17:12:20 -08:00
Noah Levitt
72816d1058 don't check robots.txt when scheduling a new site to be crawled, but mark the seed Page as needs_robots_check, and delegate the robots check to brozzler-worker; new test of robots.txt adherence 2016-11-16 12:23:59 -08:00
Noah Levitt
24cc8377fb robots.txt for testing 2016-11-16 12:12:17 -08:00
Noah Levitt
3aead6de93 monkey-patch reppy to support substring user-agent matching 2016-11-16 11:41:34 -08:00
Noah Levitt
5ac8994a24 rename webconsole to dashboard 2016-11-04 17:46:23 -07:00
Mouse Reeve
2215aaab21 Use warcprox if enable_warcprox_features is true 2016-10-18 17:39:33 -07:00
Noah Levitt
a370e7b987 tiny fix, and now the test passes for me 2016-10-14 19:21:26 -07:00
Noah Levitt
27452990ee toward getting initial tests to pass 2016-10-14 18:26:48 -07:00
Noah Levitt
56e651baeb working on basic integration tests 2016-10-13 17:12:35 -07:00
Noah Levitt
c864499a64 starting to create a framework for testing 2016-09-14 17:06:49 -07:00