31 Commits

Author SHA1 Message Date
Alex Dempsey
8b23430a87 Use black, enforce with GitHub Actions 2024-02-08 12:07:41 -08:00
Noah Levitt
f8165dc02b work around pytest issue until fix is out
https://github.com/pytest-dev/pytest/issues/5257
2019-05-15 18:46:21 -07:00
Noah Levitt
8dfd92cf7f fix this utility 2019-04-09 01:44:14 -07:00
Noah Levitt
9459ed40d0 fix typo 2019-04-04 12:38:41 -07:00
Noah Levitt
85c6ac0ab2 fix next travis-ci problem 2019-04-02 12:05:08 -07:00
Noah Levitt
18b4a26db6 porting ansible config to xenial
no more upstart, switch to daemontools, among other things
2019-03-22 23:50:46 -07:00
Noah Levitt
d19e139101 vagrant readme fixes (thanks funkyfuture) 2018-08-17 10:31:01 -07:00
Noah Levitt
56e01b9078 give vagrant vm a name in virtualbox 2018-02-13 17:05:45 -08:00
Noah Levitt
ac543ee5b6 use "ttl" for updated doublethink svc reg api 2017-05-23 11:33:04 -07:00
Noah Levitt
8256a34b4f implement resilience to warcprox outage, i.e. deal with brozzler.ProxyError in brozzler-worker 2017-04-18 17:54:12 -07:00
Noah Levitt
3d47805ec1 new model for crawling hashtags, each one is no longer a top-level page 2017-03-27 12:15:49 -07:00
Noah Levitt
934190084c Refactor the way the proxy is configured. Job/site settings "proxy" and "enable_warcprox_features" are gone. Brozzler-worker now has mutually exclusive options --proxy and --warcprox-auto. --warcprox-auto means find an instance of warcprox in the service registry, and enable warcprox features. If --proxy is provided, brozzler-worker determines whether the proxy is warcprox by consulting http://{proxy_address}/status (see https://github.com/internetarchive/warcprox/commit/8caae0d7d3), and enables warcprox features if so. 2017-03-24 13:55:23 -07:00
Noah Levitt
12fb9eaa15 use urlcanon library for canonicalization, surtification, scope match rules 2017-03-15 14:59:51 -07:00
Noah Levitt
c90c73372e need $DISPLAY set for test_brozzling.py 2016-12-21 15:15:03 -08:00
Noah Levitt
72816d1058 don't check robots.txt when scheduling a new site to be crawled, but mark the seed Page as needs_robots_check, and delegate the robots check to brozzler-worker; new test of robots.txt adherence 2016-11-16 12:23:59 -08:00
Noah Levitt
398871d46b give vagrant vm enough memory so that tests pass consistently 2016-11-14 18:26:00 -08:00
Noah Levitt
5ac8994a24 rename webconsole to dashboard 2016-11-04 17:46:23 -07:00
Noah Levitt
5a373466a3 some vagrant/ansible fixes 2016-10-14 13:47:54 -07:00
Noah Levitt
3627209be1 move ansible directory to top level; generalize formerly vagrant-specific ansible configuration; let upstart manage logging with "console log" 2016-10-13 17:21:55 -07:00
Noah Levitt
15633be612 finish vagrant-brozzler-new-job.py 2016-10-03 18:17:35 -07:00
Noah Levitt
8c9a9c5666 starting on documenting job configuration 2016-09-29 12:03:16 -07:00
Noah Levitt
2462efc4ed replace vagrant-brozzler-new-site with python script that fills in default options and passes through others 2016-09-22 01:47:23 +01:00
Noah Levitt
cc9517cb45 add missing rethinkdb config file to ansible config 2016-09-22 01:45:28 +01:00
Noah Levitt
253122d061 new script runs brozzler-new-site, which queues a new site to brozzle on the vagrant brozzler deployment 2016-09-16 16:35:44 -07:00
Noah Levitt
38af0f347b working on including pywb in vagrant environment (not finished) 2016-09-14 17:08:00 -07:00
Noah Levitt
c864499a64 starting to create a framework for testing 2016-09-14 17:06:49 -07:00
Noah Levitt
c9bc9fb67d for vagrant, static ansible inventory file, add brozzler-webconsole 2016-08-10 18:41:23 -07:00
Noah Levitt
b62d5a6350 install flash plugin for chromium 2016-07-13 15:23:50 -05:00
Noah Levitt
0b9ce94226 in vagrant/ansible, install brozzler from this checkout instead of from github master 2016-07-01 15:45:39 -05:00
Noah Levitt
ad502f33da remove accidentally committed playbook.retry 2016-06-30 17:56:56 -05:00
Noah Levitt
2aef00826b vagrant setup (unfinished) 2016-06-30 17:50:11 -05:00