brozzler/brozzler
2017-03-22 15:53:58 -07:00
dashboard rethinkstuff is now "doublethink" 2017-03-02 12:48:45 -08:00
js-templates handle errors from extract-outlinks.js, which happens on polyvore.com because it changes the definition of Set 😭 2017-02-22 10:57:11 -08:00
__init__.py three-value "brozzled" parameter for frontier.site_pages(); fix thing where every Site got a list of all the seeds from the job; and some more frontier tests to catch these kinds of things 2017-03-20 17:28:16 -07:00
behaviors.yaml convert mouseovers and simpleclicks to jinja2 2016-12-20 17:34:29 -08:00
browser.py use urlcanon library for canonicalization, surtification, scope match rules 2017-03-15 14:59:51 -07:00
chrome.py let the OS pick an available port, to avoid what appear to be timing issues causing multiple browsers to choose the same port 2017-02-22 12:44:19 -08:00
cli.py fix brozzler-easy so that warcprox features are enabled automatically (feature was already there but broken) 2017-03-22 15:15:07 -07:00
easy.py rethinkstuff is now "doublethink" 2017-03-02 12:48:45 -08:00
frontier.py three-value "brozzled" parameter for frontier.site_pages(); fix thing where every Site got a list of all the seeds from the job; and some more frontier tests to catch these kinds of things 2017-03-20 17:28:16 -07:00
job.py three-value "brozzled" parameter for frontier.site_pages(); fix thing where every Site got a list of all the seeds from the job; and some more frontier tests to catch these kinds of things 2017-03-20 17:28:16 -07:00
job_schema.yaml always save outlinks info on rethinkdb page object, get rid of 'remember_outlinks' option, to keep config simple, and because it's not a very expensive thing 2017-03-17 10:04:10 -07:00
pywb.py use urlcanon library for canonicalization, surtification, scope match rules 2017-03-15 14:59:51 -07:00
robots.py monkey-patch reppy to support substring user-agent matching 2016-11-16 11:41:34 -08:00
site.py fix brozzler-easy so that warcprox features are enabled automatically (feature was already there but broken) 2017-03-22 15:15:07 -07:00
worker.py ugh, avoid infinite recursion 2017-03-22 15:53:58 -07:00