Barbara Miller
88076595ba
comment tweak
2018-02-12 10:22:41 -08:00
Barbara Miller
668e85be9e
umbraBehavior for thejewishnews.com
2018-02-08 13:05:18 -08:00
Noah Levitt
057284c2a7
Merge pull request #88 from nlevitt/block-urls
Block google analytics URLs using new Network.setBlockedURLs API
2018-02-06 16:42:24 -08:00
Noah Levitt
791f77d8a6
add note to readme about browser version
2018-02-06 16:00:14 -08:00
Noah Levitt
506ab0ccc2
check browser version at startup
2018-02-06 15:56:50 -08:00
Vangelis Banos
3b800b583f
Reinstate logging
2018-02-06 14:48:30 -08:00
Vangelis Banos
e48ad46a63
Fix typo and block legacy google-analytics.com/ga.js
2018-02-06 14:47:01 -08:00
Vangelis Banos
f54d62ea40
Use Network.setBlockedURLs instead of Debugger to block URLs
2018-02-06 14:47:01 -08:00
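Note on f54d62ea40: a minimal sketch of the kind of DevTools protocol message this change implies, assuming the `urls` parameter that `Network.setBlockedURLs` accepted at the time. The pattern list and the helper name `blocked_urls_message` are hypothetical; the commits above name only Google Analytics and the legacy `google-analytics.com/ga.js` script.

```python
import json

# Hypothetical patterns; the real list used by the project is not shown here.
BLOCKED_URLS = [
    "*google-analytics.com/analytics.js",
    "*google-analytics.com/ga.js",
]


def blocked_urls_message(msg_id, urls):
    """Build the JSON text of a Network.setBlockedURLs command.

    msg_id is the caller-chosen message id that the browser echoes back
    in its response.
    """
    return json.dumps({
        "id": msg_id,
        "method": "Network.setBlockedURLs",
        "params": {"urls": list(urls)},
    })


# The resulting string would be sent over the browser's debugging websocket.
message = blocked_urls_message(1, BLOCKED_URLS)
```

Compared with pausing in the Debugger domain, a single blocking command like this leaves the filtering to the browser's network layer.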
Noah Levitt
fc000ff515
bump dev version after PR merge
2018-02-06 12:14:53 -08:00
jkafader
07b961efaf
Merge pull request #85 from nlevitt/claim-batches
WIP: claim sites to brozzle in batches to reduce contention over sites table
2018-02-06 12:01:30 -08:00
Noah Levitt
9a0941f1fd
Merge branch 'master' into claim-batches
* master:
back to dev version number
commit for beta release
this should fix travis build?
fix tests
update brozzler-easy for current warcprox api
simpleclicks for minutes PDF
2018-02-06 11:46:15 -08:00
Noah Levitt
d36d574e58
Merge pull request #87 from internetarchive/ARI-5294
capture citymedfordwi.civicweb.net minutes PDFs
2018-02-05 13:19:11 -08:00
Noah Levitt
95cbfa96e2
back to dev version number
2018-02-02 16:54:29 -08:00
Noah Levitt
2a0ad6d0de
commit for beta release
1.1b12
2018-02-02 16:52:42 -08:00
Noah Levitt
9ba58de292
this should fix travis build?
2018-02-02 16:25:56 -08:00
Noah Levitt
8505720c41
fix tests
2018-02-02 15:11:26 -08:00
Noah Levitt
5331aca33f
update brozzler-easy for current warcprox api
2018-02-02 14:28:46 -08:00
Noah Levitt
7962444f09
claim sites to brozzle in batches to reduce contention over sites table
2018-02-02 13:56:24 -08:00
jkafader
a125434563
Merge pull request #83 from nlevitt/fifteen-minutes
lengthen site session brozzling time to 15 minutes
2018-01-29 15:59:16 -08:00
Noah Levitt
64211475c0
lengthen site session brozzling time to 15 minutes
This should reduce contention over the "sites" table, which should help
keep more available browsers busy across the cluster.
2018-01-29 15:34:54 -08:00
Noah Levitt
4d37f88bcb
Merge pull request #75 from galgeek/pageInterstitialShown
log Page.interstitialShown
2018-01-26 16:18:22 -08:00
Noah Levitt
0e17205e17
Merge pull request #82 from vbanos/websock-tcp-nodely
Use TCP_NODELAY in websocket connection to improve performance
2018-01-26 12:14:44 -08:00
Noah Levitt
ba8d5a3740
fix needs_browsing check
correctly handle relative url "location" response header
2018-01-26 11:00:46 -08:00
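Note on ba8d5a3740: a minimal sketch of resolving a relative "location" response header against the request URL, using the standard library. The function name `resolve_location` and the example URLs are hypothetical, and this is not the project's actual code.

```python
from urllib.parse import urljoin


def resolve_location(request_url, location):
    """Return the absolute redirect target for a "location" header value.

    urljoin leaves absolute values untouched and resolves relative ones
    against the URL that was requested.
    """
    return urljoin(request_url, location)


absolute_path = resolve_location("http://example.com/a/b", "/c")
relative_path = resolve_location("http://example.com/a/b", "c")
already_absolute = resolve_location("http://example.com/a/b", "https://other.example/x")
```

Comparing the unresolved header value against the request URL would treat every relative redirect as a different page, which is the kind of mistake the new test is described as exposing.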
Noah Levitt
bf5401283e
new test test_needs_browsing
currently exposes bug in resolving "location" response header
2018-01-26 10:59:18 -08:00
Noah Levitt
67d5a0e671
increase timeout waiting for screenshot
because we are seeing timeouts on moderately busy machines
2018-01-26 10:19:23 -08:00
Vangelis Banos
3b0d1203c3
Use TCP_NODELAY in websocket connection to improve performance
2018-01-25 22:39:32 +00:00
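Note on 3b0d1203c3: a minimal sketch of enabling TCP_NODELAY on a plain TCP socket. How the option is handed to the websocket library is not shown in the log, so the helper `make_nodelay_socket` is an assumption for illustration only.

```python
import socket


def make_nodelay_socket():
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY sends small writes immediately instead of coalescing them.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_nodelay_socket()
# Read the option back to confirm it took effect.
nodelay_enabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
sock.close()
```

Debugging-protocol traffic is made of many small request/response messages, which is the pattern where Nagle's algorithm tends to add latency.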
Barbara Miller
bc21b325d7
simpleclicks for minutes PDF
2018-01-23 11:43:35 -08:00
Noah Levitt
c934759852
pass canonicalized url to youtube-dl
avoids this kind of error:
wbgrp-svc294 2018-01-19 21:04:43,973 648 ERROR BrozzlingThread:39295 youtube_dl.to_stderr(YoutubeDL.py:514) ERROR: Unable to download webpage: <urlopen error no host given> (caused by URLError('no host given',))
wbgrp-svc294 2018-01-19 21:04:43,973 648 ERROR BrozzlingThread:39295 root.brozzle_site(worker.py:521) proxy error (site.proxy=wbgrp-svc400.us.archive.org:8002), will try to choose a healthy instance next time site is brozzled: youtube-dl hit apparent proxy error from https:/www.laphil.com/press1718
2018-01-22 14:52:54 -08:00
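Note on c934759852: the logged URL `https:/www.laphil.com/press1718` has a single slash after the scheme, so a standard parser finds no host. The sketch below reproduces that with the standard library. The repair function `fix_scheme_slashes` is a hypothetical stand-in and is not the canonicalization the project actually applies.

```python
import re
from urllib.parse import urlsplit


def has_host(url):
    """True if the URL parses with a non-empty network location."""
    return bool(urlsplit(url).netloc)


def fix_scheme_slashes(url):
    """Hypothetical repair: force exactly two slashes after http: or https:."""
    return re.sub(r"^(https?:)/+", r"\1//", url, flags=re.IGNORECASE)


raw_url = "https:/www.laphil.com/press1718"
repaired_url = fix_scheme_slashes(raw_url)
```

With one slash, the host name ends up in the path component, which matches the "no host given" error in the log above.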
Noah Levitt
c22e81341a
bump version after pull request merge
2018-01-19 15:02:55 -08:00
Noah Levitt
7f78c335e1
--warcprox-auto distribute assigned sites evenly (#78)
--warcprox-auto distribute assigned sites evenly
When running with --warcprox-auto, choose the warcprox instance with
the fewest assigned sites, instead of the one with the lowest load in the
service registry. In practice we often start brozzling a whole bunch of
sites at approximately the same time, and because it takes time for that
to affect the "load" reported by warcprox instances, sites end up being
distributed very unevenly.
2018-01-19 14:54:33 -08:00
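Note on 7f78c335e1: a minimal sketch of picking the instance with the fewest assigned sites. The data shapes (a list of proxy addresses and site records carrying a `proxy` field) and the name `choose_warcprox` are assumptions, not the project's actual schema or code.

```python
from collections import Counter


def choose_warcprox(instances, active_sites):
    """Return the proxy address with the fewest sites currently assigned.

    instances: list of proxy addresses (hypothetical shape).
    active_sites: list of dicts with a "proxy" key (hypothetical shape).
    """
    assigned = Counter(
        site["proxy"] for site in active_sites if site.get("proxy"))
    # Counter yields 0 for instances with no sites; ties go to the first listed.
    return min(instances, key=lambda address: assigned[address])


instances = ["wp1:8000", "wp2:8000", "wp3:8000"]
active_sites = [
    {"proxy": "wp1:8000"},
    {"proxy": "wp1:8000"},
    {"proxy": "wp2:8000"},
]
chosen = choose_warcprox(instances, active_sites)
```

Because the count changes as soon as a site is assigned, a burst of new sites spreads across instances without waiting for reported load to catch up.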
Noah Levitt
9e80a3b0d3
Merge pull request #71 from internetarchive/brofurb
JS class-based generalized behavior
2018-01-18 12:23:18 -08:00
Barbara Miller
2f3f258856
update copyright dates
2018-01-15 19:39:41 -08:00
Barbara Miller
e52ba4c8ef
rm default.js
2018-01-15 19:38:15 -08:00
Barbara Miller
93ceeacfd7
rm obsolete
2018-01-15 19:36:32 -08:00
Barbara Miller
2ce9cf41a1
resolve conflicts
2018-01-15 19:34:47 -08:00
Barbara Miller
9aa670ece5
simple multi-selector test with window.scroll
2018-01-15 17:58:10 -08:00
Barbara Miller
7dccc809d0
use shorter interval
2018-01-15 17:58:10 -08:00
Barbara Miller
06a2b5f817
tidied
2018-01-15 17:58:10 -08:00
Barbara Miller
b979372e85
update copyright
2018-01-15 17:58:10 -08:00
Barbara Miller
93a81a4a37
qa simpleIntervalFunc for now
2018-01-15 17:58:10 -08:00
Barbara Miller
b589324a05
add simplerIntervalFunc...
2018-01-15 17:58:10 -08:00
Barbara Miller
f78e1ff710
minor edits
2018-01-15 17:58:10 -08:00
Barbara Miller
d0203ff9eb
tweaks post-troubleshooting ARI-5241
2018-01-15 17:58:10 -08:00
Barbara Miller
dd3b041eec
class-based generalized behavior
2018-01-15 17:58:10 -08:00
Barbara Miller
34fb4baf00
WIP: class-based generalized behavior
2018-01-15 17:58:10 -08:00
Barbara Miller
b968397fbe
update default selectors
2018-01-15 17:58:10 -08:00
Barbara Miller
e364b79796
refurb behaviors.yaml 171015
2018-01-15 17:58:10 -08:00
Noah Levitt
016bd5d3f7
Merge pull request #77 from vbanos/chrome-stop-del-tmpdir
Fix to delete tmpdir on Chrome.stop()
2018-01-15 10:36:50 -08:00
Vangelis Banos
820c7cd8cc
Fix to delete tmpdir on Chrome.stop()
The ``self._home_tmpdir.cleanup()`` call is not always executed when
stopping Chrome. As a result, a large number of ``/tmp/tmpXXX`` dirs
accumulate in production.
The reason is that ``Chrome.stop()`` can exit at the ``return``
statement on the following line:
https://github.com/internetarchive/brozzler/blob/master/brozzler/chrome.py#L268
so ``cleanup()`` never runs.
Moving ``cleanup()`` into the ``finally`` clause of the
``try/except/finally`` block makes it always run at the end of
``Chrome.stop()``, so the tmp directory is cleaned up in every case.
2018-01-15 13:09:43 +00:00
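Note on 820c7cd8cc: a minimal sketch of the pattern described, in which cleanup placed in ``finally`` still runs after an early ``return``. The class `FakeBrowser` is a hypothetical stand-in and not the project's ``Chrome`` class.

```python
import os
import tempfile


class FakeBrowser:
    """Hypothetical stand-in that owns a temporary home directory."""

    def __init__(self):
        self._home_tmpdir = tempfile.TemporaryDirectory()
        self.running = False  # simulates a process that already exited

    def stop(self):
        try:
            if not self.running:
                # Early return: cleanup placed after this point, outside
                # of finally, would be skipped.
                return
            self.running = False
        finally:
            # Runs on every exit path, including the early return above.
            self._home_tmpdir.cleanup()


browser = FakeBrowser()
tmp_path = browser._home_tmpdir.name
browser.stop()
tmpdir_removed = not os.path.exists(tmp_path)
```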
Noah Levitt
4f37dc0104
Merge pull request #73 from vbanos/configurable-js-templates
Configurable JS templates location
2018-01-10 11:43:16 -08:00