33 Commits

Author SHA1 Message Date
Misty De Méo
39c889db47 cluster tests: structlog 2025-02-21 09:17:05 -08:00
Alex Dempsey
8b23430a87 Use black, enforce with GitHub Actions 2024-02-08 12:07:41 -08:00
Noah Levitt
e23fa68d65 fix bug clobbering own changes to parent_page
and some other tweaks (python 3.5+, pytest logging config, ...)
2019-10-17 13:47:54 -07:00
Noah Levitt
0a1360ab25 don't use localhost for test http server...
... because apparently sometimes chromium bypasses the proxy for local
addresses
2019-05-15 18:49:18 -07:00
Noah Levitt
433b201b52 use logging.warning() to quiet py37 warnings 2019-04-09 01:43:38 -07:00
Noah Levitt
85c6ac0ab2 fix next travis-ci problem 2019-04-02 12:05:08 -07:00
Noah Levitt
1ef717fa75 test exposing bug that we don't send warcprox-meta
when pushing stitched-up video with WARCPROX_WRITE_RECORD
2018-09-18 01:05:18 -07:00
Noah Levitt
3c27132aaa test for youtube-dl stitch-up 2018-08-15 17:42:53 -07:00
Noah Levitt
d4db8ba9bc is test_time_limit is failing because of timing?
give it up to ten seconds to mark the job finished
2018-06-25 10:35:24 -05:00
Noah Levitt
331d07fe88 these ssurts are strings too 2018-05-16 17:11:08 -07:00
Noah Levitt
5bb392ec7c ssurts are strings now
because they're friendlier that way in rethinkdb
2018-05-16 16:43:10 -07:00
Noah Levitt
fc05cac338 ok seriously tests 2018-05-14 15:38:28 -07:00
Noah Levitt
05f8ab3495 fix more tests for new approach sans scope['surt'] 2018-05-14 15:38:28 -07:00
Noah Levitt
d7512fbeb6 move time limit enforcement
now it's next to stop request enforcement which makes more sense and
supports more timely action
2018-03-01 11:28:30 -08:00
Noah Levitt
8505720c41 fix tests 2018-02-02 15:11:26 -08:00
Noah Levitt
384c877e9a new test exposing problem where each hashtag visited causes a page load, if page redirects 2017-09-27 14:08:28 -07:00
Noah Levitt
8256a34b4f implement resilience to warcprox outage, i.e. deal with brozzler.ProxyError in brozzler-worker 2017-04-18 17:54:12 -07:00
Noah Levitt
df7734f2ca new command line utility brozzler-stop-crawl, with tests 2017-04-14 18:06:15 -07:00
Noah Levitt
3d47805ec1 new model for crawling hashtags, each one is no longer a top-level page 2017-03-27 12:15:49 -07:00
Noah Levitt
a836269e95 remove some vestiges of old proxy stuff 2017-03-24 16:04:43 -07:00
Noah Levitt
934190084c Refactor the way the proxy is configured. Job/site settings "proxy" and "enable_warcprox_features" are gone. Brozzler-worker now has mutually exclusive options --proxy and --warcprox-auto. --warcprox-auto means find an instance of warcprox in the service registry, and enable warcprox features. --proxy is provided, determines if proxy is warcprox by consulting http://{proxy_address}/status (see https://github.com/internetarchive/warcprox/commit/8caae0d7d3), and enables warcprox features if so. 2017-03-24 13:55:23 -07:00
Noah Levitt
242ff51ec7 fix bug with seed redirects where scope change was applied too late to affect scoping of outlinks from the seed (with automated tests) 2017-03-06 15:13:40 -08:00
Noah Levitt
569af05b11 rethinkstuff is now "doublethink 2017-03-02 12:48:45 -08:00
Noah Levitt
5c684779e5 pywb support for thumbnail: and screenshot: urls 2017-01-31 10:26:38 -08:00
Noah Levitt
4b6831b464 new flag Page.blocked_by_robots 2017-01-30 10:43:25 -08:00
Noah Levitt
86ac48d6c3 generalized support for login doing automatic detection of login form on a page 2016-12-19 17:30:09 -08:00
Noah Levitt
72816d1058 don't check robots.txt when scheduling a new site to be crawled, but mark the seed Page as needs_robots_check, and delegate the robots check to brozzler-worker; new test of robots.txt adherence 2016-11-16 12:23:59 -08:00
Noah Levitt
5ac8994a24 rename webconsole to dashboard 2016-11-04 17:46:23 -07:00
Mouse Reeve
2215aaab21 Use warcprox if enable_warcprox_features is true 2016-10-18 17:39:33 -07:00
Noah Levitt
a370e7b987 tiny fix, and now the test passes for me 2016-10-14 19:21:26 -07:00
Noah Levitt
27452990ee toward getting initial tests to pass 2016-10-14 18:26:48 -07:00
Noah Levitt
56e651baeb working on basic integration tests 2016-10-13 17:12:35 -07:00
Noah Levitt
c864499a64 starting to create a framework for testing 2016-09-14 17:06:49 -07:00