Hunter Stern
|
0edef7be6b
|
Merge remote-tracking branch 'internetarchive/master' into ari-3774
|
2014-09-22 14:12:59 -07:00 |
|
Adam Miller
|
916f1b990e
|
Cleanup instagram timeout and state handling
|
2014-09-17 16:26:53 -07:00 |
|
Adam Miller
|
eb3ea95b87
|
Cleanup timeout logic
|
2014-09-17 15:26:13 -07:00 |
|
Adam Miller
|
5a3c8e9a05
|
ARI-4016 - Support: embedded videos on marquette.edu
|
2014-09-15 11:06:33 -07:00 |
|
Hunter Stern
|
a2ea2501db
|
More soundcloud changes.
|
2014-09-12 16:07:32 -07:00 |
|
Hunter Stern
|
e320654d1e
|
Allow selector to detect https and http soundcloud widget.
|
2014-09-12 09:56:41 -07:00 |
|
Adam Miller
|
7afdd7b50b
|
Added behavior for instagram to scroll past two pages, and click to enlarge images.
|
2014-09-02 17:02:30 -07:00 |
|
Noah Levitt
|
9052fd8569
|
add license section
|
2014-09-02 16:11:49 -07:00 |
|
Noah Levitt
|
51d6b1a4e2
|
apache license
|
2014-09-02 16:10:00 -07:00 |
|
Hunter Stern
|
eb8c9faf89
|
Merge remote-tracking branch 'internetarchive/master'
|
2014-08-28 10:56:27 -07:00 |
|
Adam Miller
|
ce2957269f
|
Merge pull request #31 from nlevitt/drain-republish
new utility queue-json, and another change to help with draining from and republishing to amqp
|
2014-08-26 16:58:21 -07:00 |
|
Noah Levitt
|
2ab767eaa9
|
make drain-queue output actual json instead of python dict syntax
|
2014-08-26 23:46:00 +00:00 |
|
Noah Levitt
|
fe1d9e01eb
|
utility queue-json to publish an arbitrary json blob to amqp
|
2014-08-26 23:45:42 +00:00 |
|
Hunter Stern
|
0e7fd93967
|
Merge remote-tracking branch 'internetarchive/master' into ari-3774
|
2014-08-26 15:12:13 -07:00 |
|
vonrosen
|
bbba344886
|
Merge pull request #29 from nlevitt/handle-bad-message
reject (discard) bad messages
|
2014-08-20 08:21:29 -07:00 |
|
Noah Levitt
|
c886b57d3a
|
reject (discard) bad messages
|
2014-08-19 18:51:43 -07:00 |
|
Hunter Stern
|
b110a57938
|
Merge remote-tracking branch 'internetarchive/master'
|
2014-08-14 15:26:15 -07:00 |
|
Noah Levitt
|
9d90b5830a
|
facebook - scroll all the way to the bottom before scrolling back up to click more stuff
|
2014-08-01 16:53:13 -07:00 |
|
Noah Levitt
|
dd9ef50484
|
suppress logging of umbraBehaviorFinished() message which is sent a lot
|
2014-08-01 16:22:45 -07:00 |
|
Hunter Stern
|
6a5d1e2266
|
Disable web security in chromium so iframes on different domains can be accessed by behavior javascript.
|
2014-07-24 16:46:06 -07:00 |
|
Hunter Stern
|
80f3a4a067
|
Enhancement to allow embedded soundcloud audio files to be detected
|
2014-07-24 16:44:05 -07:00 |
|
Hunter Stern
|
e7e82aa913
|
Merge branch 'master' of github.com:vonrosen/umbra into vonrosenmaster
|
2014-07-23 11:09:27 -07:00 |
|
Adam Miller
|
8e44e18053
|
Merge pull request #26 from nlevitt/dev
stability!
|
2014-07-21 13:18:24 -07:00 |
|
Noah Levitt
|
ae838af25d
|
set amqp prefetch count to the number of urls we can handle at a time, i.e. max_active_browsers (with prefetch=1 umbra was only browsing one url at a time, after quickly burning through urls already on the queue when started)
|
2014-07-02 10:30:51 -07:00 |
|
Noah Levitt
|
6306c16698
|
kill -HUP to immediately close and reopen amqp consumer connection
|
2014-06-23 17:18:27 -07:00 |
|
Noah Levitt
|
02c054c284
|
do not wait forever for zombie websocket threads (this change should also reveal how we get these sometimes)
|
2014-06-20 18:13:45 -07:00 |
|
Noah Levitt
|
9b32f9a3d1
|
ugh, it was better with the default width, in spite of the ridiculous behavior.script
|
2014-06-20 14:40:12 -07:00 |
|
Noah Levitt
|
2cf69bdaff
|
seriously, don't try to wrap any lines, pprint
|
2014-06-20 14:37:33 -07:00 |
|
Noah Levitt
|
c6fa00812c
|
when dumping state on SIGQUIT, build the whole string before printing to avoid stuff getting intermingled with other logging and stuff
|
2014-06-20 14:33:01 -07:00 |
|
Noah Levitt
|
ead46d5716
|
more elaborate dumping of state on SIGQUIT to replace faulthandler
|
2014-06-20 14:05:33 -07:00 |
|
Noah Levitt
|
ebb14ff889
|
get rid of chrome_wait straggler
|
2014-06-18 17:31:28 -07:00 |
|
Noah Levitt
|
17ef9d9f28
|
close and reopen the amqp consumer connection only every 2.5 hours instead of every 15 minutes, because now that we have to wait for all browsers to close when we do the reconnection, it slows us down a lot
|
2014-06-18 14:58:44 -07:00 |
|
Noah Levitt
|
025db91dea
|
get rid of --browser-wait and --routing-key in favor of sensible defaults, some other tweaks
|
2014-06-11 10:58:08 -07:00 |
|
Noah Levitt
|
a78e60f1da
|
wait for a browser to become available and start it up before reading the next url from amqp; ack the message only after completing the browsing process successfully, and requeue if it's not successful; some refactoring to make the timing work for this
|
2014-06-09 13:15:05 -07:00 |
|
Noah Levitt
|
e3c23a0f2b
|
Merge pull request #25 from vonrosen/ari-3724
Allow flash requests to be detected. For https://webarchive.jira.com/browse/ARI-3724
|
2014-06-06 15:15:24 -07:00 |
|
vonrosen
|
d40b542ffe
|
Merge pull request #1 from vonrosen/ari-3724
Allow flash requests to be detected.
|
2014-06-06 10:51:09 -07:00 |
|
Hunter Stern
|
41270af223
|
Allow flash requests to be detected.
|
2014-06-06 10:47:29 -07:00 |
|
vonrosen
|
e8456e0a62
|
Merge pull request #24 from nlevitt/dev
more improvements, mostly for robustness
|
2014-06-05 12:00:37 -07:00 |
|
Noah Levitt
|
dd2d36328f
|
scroll up faster on facebook
|
2014-06-04 12:34:20 -07:00 |
|
Noah Levitt
|
c2153be288
|
start behaviors again on any Page.loadEventFired, because if we don't do that, we keep asking the page if the behavior thinks it's finished, and it doesn't know what we're talking about
|
2014-06-03 18:06:02 -07:00 |
|
Noah Levitt
|
bfb6cac25f
|
use temp dir as $HOME instead of just chromium user-data-dir, because sometimes we have been seeing chrome print this error message and hang "[1975:2001:0603/215855:ERROR:nss_util.cc(444)] Error initializing NSS with a persistent database (sql:/home/archiveit/.pki/nssdb): NSS error code: -8187"
|
2014-06-03 16:02:00 -07:00 |
|
Noah Levitt
|
e619e013b6
|
sleep for 5 seconds after starting a browser, since starting 20 at once brings the computer to its knees
|
2014-06-03 15:57:12 -07:00 |
|
Noah Levitt
|
1f91018d91
|
even more patience killing chrome, send another sigterm every ten seconds if chrome is still alive
|
2014-06-02 12:09:15 -07:00 |
|
Noah Levitt
|
c6bd2417d7
|
good smarter killing of chrome
|
2014-06-02 11:58:11 -07:00 |
|
Noah Levitt
|
1ae9b83dab
|
Merge branch 'dev' of github.com:nlevitt/umbra into dev
|
2014-05-30 23:07:54 -07:00 |
|
Noah Levitt
|
56a721f059
|
dump stack trace and don't return browser to pool on critical error where chrome process might still be running
|
2014-05-30 23:07:39 -07:00 |
|
Noah Levitt
|
b2e27b99d2
|
nice log message when fully shut down
|
2014-05-30 17:32:01 -07:00 |
|
Noah Levitt
|
c9d503e690
|
log version number at startup
|
2014-05-30 15:00:01 -07:00 |
|
Noah Levitt
|
ed92f3bd53
|
for the version string, use abbreviated commit hash instead of attempting to use the branch name
|
2014-05-29 23:33:14 -07:00 |
|
Noah Levitt
|
bef57e2819
|
for version string, try to handle case where head is detached
|
2014-05-29 20:57:33 -07:00 |
|