| Author | Commit | Message | Date |
|---|---|---|---|
| Noah Levitt | e6eeca6ae2 | handle 420 Reached limit when fetching robots in brozzler-hq | 2015-08-01 17:54:29 +00:00 |
| Noah Levitt | 511e19ff4d | handle 420 "Limit reached" when browser receives it | 2015-08-01 01:26:59 +00:00 |
| Noah Levitt | 7b98af7d9f | handle reached limit response from warcprox | 2015-08-01 00:09:57 +00:00 |
| Noah Levitt | 88f352efea | use new fork of youtube-dl with support for extra http headers on every request | 2015-07-21 19:23:01 +00:00 |
| Noah Levitt | a54e60dbaf | change terminology CrawlUrl => Page since that better represents what it means in brozzler and differentiates from heritrix | 2015-07-16 18:39:29 -07:00 |
| Noah Levitt | f2bc7ec271 | refactor brozzler.hq.Site and brozzler.url.CrawlUrl into new brozzler.site package; fix bugs in robots.txt handling | 2015-07-15 18:03:03 -07:00 |
| Noah Levitt | 4cfb287397 | refactor hq code into hq module | 2015-07-15 14:26:48 -07:00 |
| Noah Levitt | fd0c3322ee | update readme, s/umbra/brozzler/ in most places, delete non-brozzler stuff | 2015-07-13 17:09:39 -07:00 |