Noah Levitt | d3063fbd2b | move cookie db management code into chrome.py | 2016-12-06 18:04:51 -08:00
Noah Levitt | ce03381b92 | move _find_available_ports to chrome.py, changing the way it works so that browser:9200 doesn't get stuck at 9201 forever, which pushes 9201 to 9202 etc, and add a unit test | 2016-12-06 17:12:20 -08:00
Noah Levitt | 74009852d6 | split Chrome class into its own module | 2016-12-06 12:50:38 -08:00
Noah Levitt | 3c43fdaced | new utility brozzler-list-captures for looking up entries in the "captures" table | 2016-11-30 00:52:14 +00:00
Noah Levitt | 9567c088c8 | in warcprox 2.0b2, captures table field has been renamed to "record_length" | 2016-11-21 16:21:21 -08:00
Noah Levitt | 55c9ae07b7 | remove flickr behavior, flickr is better off with the default behavior for now | 2016-11-16 17:16:48 -08:00
Noah Levitt | 72816d1058 | don't check robots.txt when scheduling a new site to be crawled, but mark the seed Page as needs_robots_check, and delegate the robots check to brozzler-worker; new test of robots.txt adherence | 2016-11-16 12:23:59 -08:00
Noah Levitt | 3aead6de93 | monkey-patch reppy to support substring user-agent matching | 2016-11-16 11:41:34 -08:00
Noah Levitt | 398871d46b | give vagrant vm enough memory so that tests pass consistently | 2016-11-14 18:26:00 -08:00
Noah Levitt | a74247412c | need warcprox to listen on public address because that's what it puts in the service registry | 2016-11-14 10:03:40 -08:00
Noah Levitt | 28b010a2ba | back to dev version number | 2016-11-11 14:58:55 -08:00
Noah Levitt | 7aca046905 | 1.1b7 | 2016-11-11 14:58:07 -08:00
Noah Levitt | 26b571219b | use \n to delimit outlinks because urls can contain spaces (and anything else except [\n\t\0]) in the fragment part even after browser canonicalization | 2016-11-11 14:14:47 -08:00
Noah Levitt | 02bf23059e | pass behavior_parameters from job configuration into Site objects | 2016-11-09 13:43:10 -08:00
Noah Levitt | 8e115b44fa | add --behavior-parameters argument to brozzler-new-site | 2016-11-09 13:12:36 -08:00
Noah Levitt | 953e50d9a6 | fix bug in final_bounces (not sure what I was thinking) | 2016-11-09 13:12:14 -08:00
Noah Levitt | 054cb255ac | cat logs on travis-ci failure | 2016-11-08 14:26:12 -08:00
Noah Levitt | 125a31165a | reppy 0.4.1 has a significantly different api apparently, so for now let's go back to 0.3.4 | 2016-11-08 14:11:46 -08:00
Noah Levitt | fe18d915f5 | still trying to get installation of pip to work on travis-ci | 2016-11-08 13:50:12 -08:00
Noah Levitt | f10b4c71e6 | update for reppy api change and pin to current version of reppy | 2016-11-08 13:39:32 -08:00
Noah Levitt | cba5fa4a0b | tweaks to ansible config to try to get the deployment to run on travis-ci | 2016-11-08 13:31:52 -08:00
Noah Levitt | 9d66f294ec | move behavior_parameters into top level of site configuration | 2016-11-07 18:16:04 -08:00
Noah Levitt | abca90a128 | install the virtualenv package with pip because the apt version is old and conflicts with the recent version of pip we're using | 2016-11-07 17:51:43 -08:00
Noah Levitt | 99feeab581 | logging tweak | 2016-11-04 17:53:02 -07:00
Noah Levitt | 5ac8994a24 | rename webconsole to dashboard | 2016-11-04 17:46:23 -07:00
Noah Levitt | 5bd4908e1d | punycode host part of url to avoid errors doing WARCPROX_WRITE_RECORD | 2016-10-26 13:50:23 -07:00
Noah Levitt | f30c143c66 | avoid exception in case of url without host part | 2016-10-26 12:45:24 -07:00
Noah Levitt | 332912acd7 | apparently response.status doesn't work sometimes; response.getcode() is documented so hopefully it keeps working | 2016-10-25 17:50:49 -07:00
Noah Levitt | 70ce642bee | integer job ids are permitted as well as string | 2016-10-21 21:25:16 +00:00
Noah Levitt | 21891476c4 | avoid use of __double_underscore member variables because they're special https://shahriar.svbtle.com/underscores-in-python | 2016-10-18 18:57:11 -07:00
Noah Levitt | becd832ea3 | bump version after merging accept-encoding pull request | 2016-10-18 17:55:00 -07:00
Noah Levitt | aae34452f5 | bump version number after merging travis-ci pull request | 2016-10-18 17:48:45 -07:00
Noah Levitt | 68a32fcbe2 | bump version number after mouse's pull request | 2016-10-18 17:45:55 -07:00
Noah Levitt | a370e7b987 | tiny fix, and now the test passes for me | 2016-10-14 19:21:26 -07:00
Noah Levitt | 4044fcb647 | fix pywb/brozzler replay of revisit records | 2016-10-14 19:15:23 -07:00
Noah Levitt | 27452990ee | toward getting initial tests to pass | 2016-10-14 18:26:48 -07:00
Noah Levitt | 5a373466a3 | some vagrant/ansible fixes | 2016-10-14 13:47:54 -07:00
Noah Levitt | 3627209be1 | move ansible directory to top level; generalize formerly vagrant-specific ansible configuration; let upstart manage logging with "console log" | 2016-10-13 17:21:55 -07:00
Noah Levitt | 56e651baeb | working on basic integration tests | 2016-10-13 17:12:35 -07:00
Noah Levitt | ed8b937277 | back to dev version | 2016-10-13 15:11:57 -07:00
Noah Levitt | 456c082875 | this is 1.1b6 | 2016-10-13 15:10:24 -07:00
Noah Levitt | 23b59f8f4e | Merge branch 'master' of github.com:internetarchive/brozzler * 'master' of github.com:internetarchive/brozzler: Ensure job_schema.yaml is installed by pip | 2016-10-13 15:00:13 -07:00
Noah Levitt | 269512f499 | make setup.py work with python2, not because the whole project works with python2, but just so it can be installed as a dependency of projects that support both python2 and python3 | 2016-10-13 14:59:31 -07:00
Noah Levitt | d82b40be68 | fix typo | 2016-10-06 18:20:41 -07:00
Alex Osborne | 0fe2ef9387 | Ensure job_schema.yaml is installed by pip. Use .yaml file extension for consistency with behaviors.yaml. | 2016-10-05 15:45:11 +11:00
Noah Levitt | 59a15d7f5c | bump dev version after merging pull requests | 2016-10-04 14:43:43 -07:00
Alex Osborne | 5ac67fe513 | Validate job conf against a Cerberus schema | 2016-10-04 21:19:25 +11:00
Noah Levitt | 15633be612 | finish vagrant-brozzler-new-job.py | 2016-10-03 18:17:35 -07:00
Noah Levitt | de5c520ad7 | pin psutil version, too | 2016-10-03 17:12:00 -07:00
Noah Levitt | f220707aa4 | pin pillow to version 3.3.0, primarily because we have a wheel in devpi for that version | 2016-10-03 17:06:58 -07:00