630 Commits

Author SHA1 Message Date
Noah Levitt
664bb33add tweaks that have been sitting here 2016-02-10 00:38:48 +00:00
Noah Levitt
887eadb99a lock down vnc 2016-02-10 00:37:36 +00:00
Hunter Stern
fe650b69ed Handle Python to JS boolean conversion 2016-02-09 10:48:33 -08:00
Hunter Stern
2ed96f9b59 Allow clicking on already clicked element to continue in behaviors if click_until_hard_timeout is set to true 2016-02-05 10:00:24 -08:00
Neil Minton
b9973c7cae Merge pull request #51 from vonrosen/ARI-4690
Make Umbra click on 'Load More' button for youtube pages
2016-02-03 14:07:51 -08:00
Hunter Stern
fe81aa4ff2 Make Umbra click on 'Load More' button for youtube pages 2016-01-28 11:53:59 -08:00
Neil Minton
54d92f88b0 Merge pull request #49 from nlevitt/work-dir-cleanup-exception
catch and log exception deleting temporary work directory
2015-12-18 11:34:54 -08:00
Noah Levitt
f1770b813d Merge pull request #48 from sfdevguy/master
Add custom behavior for Brooklyn Museum
2015-12-18 11:34:00 -08:00
Noah Levitt
8ab0857dad catch and log exception deleting temporary work directory 2015-12-18 11:26:36 -08:00
Neil Minton
c494afb749 Merge branch 'AITFIVE-497' 2015-12-02 10:02:05 -08:00
Noah Levitt
36e2bb2729 use rethinkdb native time type for date/time values 2015-11-18 02:07:27 +00:00
Noah Levitt
ca0053e3be also when adding new job, insert all sites before the job, to prevent brozzler workers thinking the job is finished before all the sites are in the db 2015-11-14 03:10:58 +00:00
Noah Levitt
3260fe4e9e when adding new job, insert the seed url Page document into the database before the Site, to avoid situation where brozzler worker claims the site, finds no pages to crawl, and decides the site is finished 2015-11-13 23:47:51 +00:00
Noah Levitt
21906f8cad vnc-websock.sh uses bashisms 2015-11-12 02:59:45 +00:00
Noah Levitt
3bcd2400f7 2 instances of warcprox; no docker for brozzler worker 2015-11-12 02:59:21 +00:00
Noah Levitt
4c2ecab856 surt==0.3b2 (available on pypi) 2015-11-12 02:58:53 +00:00
Noah Levitt
38dec97e19 logging tweaks 2015-11-12 02:58:26 +00:00
Noah Levitt
5597b4cf1a quiet down requests.packages.urllib3 2015-11-12 02:58:00 +00:00
Noah Levitt
998c3975b2 replace jobs page with home page which also lists services 2015-11-12 02:57:27 +00:00
Noah Levitt
343b5c0f82 register with service registry; only start chrome right before using it, so that web console vnc windows aren't always full of about:blank 2015-11-12 02:56:27 +00:00
Noah Levitt
b91d7e4c3f startup scripts for services needed for non-docker deployment 2015-11-11 21:28:55 +00:00
Noah Levitt
29b6a0b0d4 Merge branch 'master' of github.com:nlevitt/brozzler
* 'master' of github.com:nlevitt/brozzler:
  update detection of modal close button for facebook changes
  refactor umbraAboveBelowOrOnScreen into umbraBehavior object
  fixes for psu24 behavior
  More changes.
  Remove changes for https://webarchive.jira.com/browse/ARI-4518:
  Add fix for https://webarchive.jira.com/browse/ARI-4518
  More changes
  More changes for handling psu24 site
  Pulled in changes from https://github.com/nlevitt/umbra/tree/aitfive-451-alt
  simpler implementation for https://github.com/internetarchive/umbra/pull/42/files
  Adds routing_key to queue Queue creation
2015-11-05 20:10:22 +00:00
Noah Levitt
8c422534a5 smart waiting for tables and indexes to be ready 2015-11-05 20:10:14 +00:00
Hunter
b329d193ca Merge pull request #46 from nlevitt/facebook-modal-close
update detection of modal close button for facebook changes
2015-11-04 07:37:36 -08:00
Noah Levitt
f6f4daf24a update detection of modal close button for facebook changes 2015-11-03 15:36:31 -08:00
Noah Levitt
8889707f24 update detection of modal close button for facebook changes 2015-11-03 15:33:46 -08:00
Noah Levitt
85d87a5e42 Merge remote-tracking branch 'umbra/master'
* umbra/master:
  refactor umbraAboveBelowOrOnScreen into umbraBehavior object
  fixes for psu24 behavior
  More changes.
  Remove changes for https://webarchive.jira.com/browse/ARI-4518:
  Add fix for https://webarchive.jira.com/browse/ARI-4518
  More changes
  More changes for handling psu24 site
  Pulled in changes from https://github.com/nlevitt/umbra/tree/aitfive-451-alt
  simpler implementation for https://github.com/internetarchive/umbra/pull/42/files
  Adds routing_key to queue Queue creation
2015-11-03 15:31:38 -08:00
Neil Minton
dceef1a676 Add custom behavior for Brooklyn Museum. 2015-11-03 13:59:20 -08:00
Noah Levitt
90fad87f7e websockify startup script 2015-11-03 20:15:41 +00:00
Noah Levitt
03e7c29701 switch noVNC git url to https 2015-10-29 21:36:43 +00:00
Noah Levitt
d9d69a88fd tweaking workers page 2015-10-29 01:01:28 +00:00
Noah Levitt
7b39ba021b proof of concept presenting workers in web console with novnc 2015-10-27 19:01:21 +00:00
Noah Levitt
a0f4fd449c Merge pull request #1 from adam-miller/fixes
uncommented init imports, removed required job_id in Frontier.finished
2015-10-22 15:33:46 -07:00
Adam Miller
20bde1c482 uncommented init imports, removed required job_id in Frontier.finished 2015-10-22 22:29:24 +00:00
Noah Levitt
d1aebb0258 fix indentation 2015-10-14 00:44:29 +00:00
Noah Levitt
80f963591f mount warcs dir with sshfs; start-dead to start only services that aren't already running 2015-10-12 23:10:13 +00:00
Noah Levitt
196e52ac0a homegrown infinite scroll through pages on site page 2015-10-12 23:08:35 +00:00
Noah Levitt
3df4a3e109 make the site page present something sensible 2015-10-10 00:30:03 +00:00
Noah Levitt
549b149e39 Merge branch 'master' of github.com:nlevitt/brozzler
* 'master' of github.com:nlevitt/brozzler:
  logo
2015-10-09 20:31:15 +00:00
Noah Levitt
9ed1ac817e 4 space indent everywhere 2015-10-09 20:31:07 +00:00
Noah Levitt
0591548861 more incremental progress on web console 2015-10-09 20:12:40 +00:00
Noah Levitt
0050fe56b8 logo 2015-10-07 17:53:16 -07:00
Noah Levitt
2ddda68392 symlink to root 2015-10-08 00:37:39 +00:00
Noah Levitt
d1158ab224 incremental progress on web console 2015-10-08 00:33:49 +00:00
Noah Levitt
7ab2eb4fda brozzler web console in the mix 2015-10-08 00:31:28 +00:00
Noah Levitt
82011c15cd Merge branch 'master' of github.com:nlevitt/brozzler
* 'master' of github.com:nlevitt/brozzler:
  logo!?
2015-10-07 23:56:44 +00:00
Noah Levitt
3805c7bf93 logo!? 2015-10-07 15:45:01 -07:00
Noah Levitt
a5eb223b32 run brozzler workers inside docker containers 2015-10-06 01:24:01 +00:00
Noah Levitt
5868192e0a more stubby stuff 2015-09-28 22:05:43 +00:00
Noah Levitt
2e1601ac81 i think hash-less urls are working 2015-09-25 22:48:01 +00:00