Noah Levitt
ce03381b92
move _find_available_ports to chrome.py, changing the way it works so that browser:9200 doesn't get stuck at 9201 forever, which pushes 9201 to 9202 etc, and add a unit test
2016-12-06 17:12:20 -08:00
Noah Levitt
74009852d6
split Chrome class into its own module
2016-12-06 12:50:38 -08:00
Noah Levitt
3c43fdaced
new utility brozzler-list-captures for looking up entries in the "captures" table
2016-11-30 00:52:14 +00:00
Noah Levitt
9567c088c8
in warcprox 2.0b2, captures table field has been renamed to "record_length"
2016-11-21 16:21:21 -08:00
Noah Levitt
55c9ae07b7
remove flickr behavior, flickr is better off with the default behavior for now
2016-11-16 17:16:48 -08:00
Noah Levitt
899ee8a8dd
Update README.rst
2016-11-16 12:26:50 -08:00
Noah Levitt
6bb9d68dce
add travis-ci badge
2016-11-16 12:26:33 -08:00
Noah Levitt
72816d1058
don't check robots.txt when scheduling a new site to be crawled, but mark the seed Page as needs_robots_check, and delegate the robots check to brozzler-worker; new test of robots.txt adherence
2016-11-16 12:23:59 -08:00
Noah Levitt
24cc8377fb
robots.txt for testing
2016-11-16 12:12:17 -08:00
Noah Levitt
3aead6de93
monkey-patch reppy to support substring user-agent matching
2016-11-16 11:41:34 -08:00
Noah Levitt
398871d46b
give vagrant vm enough memory so that tests pass consistently
2016-11-14 18:26:00 -08:00
Noah Levitt
2b0a47c914
Merge pull request #27 from internetarchive/i2
update Instagram behavior, mostly css selectors
2016-11-14 12:40:55 -08:00
Noah Levitt
a74247412c
need warcprox to listen on public address because that's what it puts in the service registry
2016-11-14 10:03:40 -08:00
Noah Levitt
c9b45a7e76
looks like the problem may have been a bug in ansible 2.2.0.0, so pin to 2.1.3.0
2016-11-14 09:58:13 -08:00
Barbara Miller
12a054e6dc
update behavior, mostly css selectors
2016-11-14 09:20:40 -08:00
Noah Levitt
28b010a2ba
back to dev version number
2016-11-11 14:58:55 -08:00
Noah Levitt
7aca046905
1.1b7
2016-11-11 14:58:07 -08:00
Noah Levitt
26b571219b
use \n to delimit outlinks because urls can contain spaces (and anything else except [\n\t\0]) in the fragment part even after browser canonicalization
2016-11-11 14:14:47 -08:00
Noah Levitt
02bf23059e
pass behavior_parameters from job configuration into Site objects
2016-11-09 13:43:10 -08:00
Noah Levitt
8e115b44fa
add --behavior-parameters argument to brozzler-new-site
2016-11-09 13:12:36 -08:00
Noah Levitt
953e50d9a6
fix bug in final_bounces (not sure what I was thinking)
2016-11-09 13:12:14 -08:00
Noah Levitt
8889e4ab20
restore accidentally removed functionality handling page redirects and friends
2016-11-08 18:17:48 -08:00
Noah Levitt
054cb255ac
cat logs on travis-ci failure
2016-11-08 14:26:12 -08:00
Noah Levitt
125a31165a
reppy 0.4.1 has a significantly different api apparently, so for now let's go back to 0.3.4
2016-11-08 14:11:46 -08:00
Noah Levitt
fe18d915f5
still trying to get installation of pip to work on travis-ci
2016-11-08 13:50:12 -08:00
Noah Levitt
f10b4c71e6
update for reppy api change and pin to current version of reppy
2016-11-08 13:39:32 -08:00
Noah Levitt
cba5fa4a0b
tweaks to ansible config to try to get the deployment to run on travis-ci
2016-11-08 13:31:52 -08:00
Noah Levitt
9d66f294ec
move behavior_parameters into top level of site configuration
2016-11-07 18:16:04 -08:00
Noah Levitt
185d65bd5b
Merge remote-tracking branch 'galgeek/login'
* galgeek/login:
  add login details to behavior parameters
  initial login additions
2016-11-07 18:15:43 -08:00
Noah Levitt
abca90a128
install the virtualenv package with pip because the apt version is old and conflicts with the recent version of pip we're using
2016-11-07 17:51:43 -08:00
Noah Levitt
99feeab581
logging tweak
2016-11-04 17:53:02 -07:00
Noah Levitt
5ac8994a24
rename webconsole to dashboard
2016-11-04 17:46:23 -07:00
Barbara Miller
c670bd060e
add login details to behavior parameters
2016-11-02 16:51:19 -07:00
Barbara Miller
6c7f88c171
initial login additions
2016-11-02 16:04:18 -07:00
Noah Levitt
fef7d6a9fa
Merge pull request #25 from ato/mouseovers-behavior
Add a mouseovers behavior
2016-10-31 11:55:46 -07:00
Noah Levitt
0c8ea52b08
Merge pull request #26 from ato/flash-doc
Update Flash plugin instructions
2016-10-31 11:55:02 -07:00
Alex Osborne
a1591a169a
Update Flash plugin instructions
libpepflashplayer.so is no longer included in the Chrome release
package. Adobe are resuming Linux releases and the plugin is
now available from their download site.
2016-10-29 14:12:33 +11:00
Noah Levitt
5bd4908e1d
punycode host part of url to avoid errors doing WARCPROX_WRITE_RECORD
2016-10-26 13:50:23 -07:00
Noah Levitt
f30c143c66
avoid exception in case of url without host part
2016-10-26 12:45:24 -07:00
Noah Levitt
332912acd7
apparently response.status doesn't work sometimes; response.getcode() is documented so hopefully it keeps working
2016-10-25 17:50:49 -07:00
Noah Levitt
70ce642bee
integer job ids are permitted as well as string
2016-10-21 21:25:16 +00:00
Alex Osborne
872030d716
Add a mouseovers behavior based on simpleclicks
2016-10-21 06:46:43 +11:00
Noah Levitt
21891476c4
avoid use of __double_underscore member variables because they're special https://shahriar.svbtle.com/underscores-in-python
2016-10-18 18:57:11 -07:00
Noah Levitt
becd832ea3
bump version after merging accept-encoding pull request
2016-10-18 17:55:00 -07:00
Noah Levitt
a146ba52ae
Merge pull request #20 from internetarchive/rethinkInDoc
rethinkdb installer works for me
2016-10-18 17:54:48 -07:00
Noah Levitt
eedd2071bb
Merge pull request #21 from galgeek/encodingFixIdentity
ARI-5008 / ARI-5065 ExtraHTTPHeaders Accept-Encoding fix Identity
2016-10-18 17:53:50 -07:00
Noah Levitt
aae34452f5
bump version number after merging travis-ci pull request
2016-10-18 17:48:45 -07:00
Noah Levitt
1490a40c28
Merge pull request #23 from internetarchive/travis
Travis
2016-10-18 17:48:04 -07:00
Noah Levitt
68a32fcbe2
bump version number after mouse's pull request
2016-10-18 17:45:55 -07:00
Noah Levitt
37950e6dfd
Merge pull request #24 from mouse-reeve/select-proxy
Use warcprox if enable_warcprox_features is true
2016-10-18 17:41:56 -07:00