928 Commits

Author SHA1 Message Date
Noah Levitt
69a25bc74a equivalent functionality using angular and restful json 2015-09-25 19:15:20 +00:00
Noah Levitt
1ca17f204b brozzler web console initial fiddling 2015-09-25 17:59:38 +00:00
Hunter
c08128dfe2 Merge pull request #44 from nlevitt/AITFIVE-451-3
followup to https://github.com/internetarchive/umbra/pull/43
2015-09-24 15:44:11 -07:00
Noah Levitt
a17b0f3b8d refactor umbraAboveBelowOrOnScreen into umbraBehavior object 2015-09-24 12:34:55 -07:00
Noah Levitt
f2ead0570e fixes for psu24 behavior 2015-09-24 12:20:19 -07:00
Noah Levitt
dff4149185 missed one more use of brozzler.version 2015-09-24 00:44:35 +00:00
Noah Levitt
a94dfd27f8 oops, set brozzler.__version__ 2015-09-24 00:34:51 +00:00
Noah Levitt
8c69ca3b39 giving up on using git revision in version number :( latest issue is when installing a package that calls git to compute a version number, but cwd is some other git project, you get the wrong thing 2015-09-24 00:17:33 +00:00
Noah Levitt
9699a40645 remove "dev" from version number and switch README to rst 2015-09-23 22:35:26 +00:00
Noah Levitt
245078284d pep440 compliant versioning 2015-09-23 14:46:57 -07:00
Noah Levitt
40522ef5a5 fix some rethinkdb related stuff; most notably r.desc() and related stuff don't currently work correctly if r is a Rethinker, so use rethinkdb directly in that case 2015-09-23 01:53:05 +00:00
Noah Levitt
8bf34d9db6 tweaks 2015-09-23 00:50:38 +00:00
Noah Levitt
2bc66f52d4 new rethinkstuff.Rethinker api 2015-09-23 00:50:15 +00:00
Noah Levitt
2863b7e422 goodbye requirements.txt now that we have devpi 2015-09-23 00:49:20 +00:00
Hunter Stern
f8a70f3842 More changes. 2015-09-17 16:24:41 -07:00
Hunter Stern
8829323a38 Remove changes for https://webarchive.jira.com/browse/ARI-4518: 2015-09-17 09:07:03 -07:00
Hunter Stern
f282213981 Add fix for https://webarchive.jira.com/browse/ARI-4518 2015-09-17 08:43:30 -07:00
Noah Levitt
c780b147b3 missed "git+" 2015-09-16 19:24:48 +00:00
Noah Levitt
c682627aec Rethinker moved to pyrethink library 2015-09-16 19:24:17 +00:00
Noah Levitt
a8f9664212 separate virtualenvs 2015-09-16 19:23:11 +00:00
Hunter Stern
5ccc535f51 More changes 2015-09-16 09:23:13 -07:00
Hunter Stern
3467670900 More changes for handling psu24 site 2015-09-15 18:03:08 -07:00
Noah Levitt
5a6cbf01da Dockerfile for brozzler worker 2015-09-15 23:02:37 +00:00
Hunter Stern
ea41653c44 Pulled in changes from https://github.com/nlevitt/umbra/tree/aitfive-451-alt 2015-09-15 11:53:53 -07:00
Noah Levitt
70308c10f4 shouldn't have local paths as requirements 2015-09-15 18:07:47 +00:00
Noah Levitt
b30cc2d68b simpler implementation for https://github.com/internetarchive/umbra/pull/42/files 2015-09-14 17:57:01 -07:00
Noah Levitt
dc9d1a4959 detecting job finish seems to be working now 2015-09-10 01:38:31 +00:00
Noah Levitt
92a288bc35 detect jobs finishing! (not well tested yet) 2015-09-09 22:11:48 +00:00
Noah Levitt
72e72e03c4 brozzler-job-starter.py -> ait-brozzler-boss.py 2015-09-09 22:11:14 +00:00
Noah Levitt
1b94d10723 on reset, mark active jobs as finished 2015-09-08 22:38:39 +00:00
Noah Levitt
290ea433a5 save full size screenshot as jpeg too 2015-09-08 22:37:35 +00:00
Noah Levitt
9698b0f847 create thumbnail of screenshot and send to warcprox 2015-09-07 06:27:21 +00:00
Noah Levitt
565ab5f936 save screenshots with new scheme url screenshot:..., WARC-Type:resource 2015-09-07 00:26:37 +00:00
Noah Levitt
993ae6a833 run ait5 partner webapp; consolidate "status" and "fullstatus" 2015-09-04 21:02:33 +00:00
Noah Levitt
5fe2805285 fix bug claiming site, looks like there could be a race condition with other worker claiming the same site 2015-09-04 01:36:29 +00:00
Noah Levitt
3c23aa8fd4 finally, the jobs table 2015-09-03 01:05:03 +00:00
Noah Levitt
6cda4739b8 log exception when thread dies (seems to be dying silently sometimes) 2015-09-03 01:04:41 +00:00
Noah Levitt
839bf6f4ae script to help with starting/restarting/etc in my dev environment 2015-09-03 01:03:19 +00:00
Noah Levitt
f334107b47 support for specifying rethinkdb database name; wrap rethinkdb operations and retry if appropriate (as best as we can tell) 2015-08-28 00:37:26 +00:00
Noah Levitt
cf91fb1377 Revert "use dependency_links instead of requirements.txt in spite of ugliness of --process-dependency-links, #egg=..., so that dependent projects can use brozzler more easily"
Ugh.. too much pain, not worth the time to figure out the magic #egg= incantation.

This reverts commit 78ca0701651c35bda69122ddf652cbb8d95daeb0.
2015-08-26 19:44:04 +00:00
Noah Levitt
78ca070165 use dependency_links instead of requirements.txt in spite of ugliness of --process-dependency-links, #egg=..., so that dependent projects can use brozzler more easily 2015-08-26 19:22:59 +00:00
Noah Levitt
efa640c640 refactor to simplify starting new job from code 2015-08-25 19:52:33 +00:00
Noah Levitt
68de85022a there is no hq anymore; database notes can still be found in git history, though there's nothing about rethinkdb 2015-08-21 17:55:29 +00:00
Noah Levitt
231d019659 use nlevitt fork of surt library for less stupid handling of mailto: urls, etc 2015-08-20 21:23:59 +00:00
Noah Levitt
ee50818dca if database already exists but tables don't, just create them 2015-08-20 21:23:08 +00:00
Noah Levitt
3af1e10e13 make it work again, and list discovered outlinks 2015-08-20 21:22:08 +00:00
Noah Levitt
8b45d7eb69 since I can't figure out what's causing these sporadic errors fetching certain robots.txt through warcprox, stick a retry loop around the fetch 2015-08-19 22:50:04 +00:00
Noah Levitt
ad543e6134 enforce time limits; move scope_and_schedule_outlinks into frontier.py; fix bugs around scoping on seed redirect 2015-08-19 20:16:25 +00:00
Noah Levitt
ddce1cdc71 fix mistakenly removed import; try to shut down chrome in case of unexpected exception 2015-08-19 20:04:46 +00:00
Noah Levitt
2533229fa1 add __all__ to modules 2015-08-19 19:01:28 +00:00