1625 Commits

Author SHA1 Message Date
Barbara Miller
533a5e74ee Merge branch 'requestIntercepted' into qa 2019-05-14 12:00:23 -07:00
Barbara Miller
47721fc1b5 log Network.requestIntercepted 2019-05-14 11:59:59 -07:00
Noah Levitt
ee8ef23f0c fix mistake in job-conf.rst 2019-04-30 10:49:48 -07:00
Noah Levitt
411b3f266a bump version after merge 2019-04-09 22:07:51 +00:00
Noah Levitt
d4386491df
Merge pull request #151 from nlevitt/no-cerberus-normalize
don't attempt cerberus normalization
2019-04-09 15:06:17 -07:00
Noah Levitt
5385232b40 don't attempt cerberus normalization
which encumbers the validation with additional requirements,
specifically makes it difficult to validate a subclass of `dict` because
it expects a constructor that works like dict.__init__()
2019-04-09 01:45:37 -07:00
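
The constructor problem described in the commit message above can be illustrated in plain Python, without cerberus. This is only a sketch: `Settings` and `naive_copy` are hypothetical names, and the copy step is an assumption about the kind of operation a normalizer might perform, not cerberus's actual code.

```python
class Settings(dict):
    """Hypothetical dict subclass whose constructor differs from dict.__init__()."""

    def __init__(self, name, values):
        super().__init__(values)
        self.name = name


def naive_copy(document):
    # Assumed normalizer behavior: rebuild the document through its own type.
    # This only works if the constructor accepts a single mapping argument.
    return type(document)(document)


doc = Settings("job", {"id": "example"})
try:
    naive_copy(doc)
    copy_failed = False
except TypeError:
    # Settings.__init__() needs two positional arguments, so the copy fails.
    copy_failed = True
```

A plain `dict` passes through `naive_copy` unchanged, which is why the problem only shows up with subclasses that change the constructor signature.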
Noah Levitt
f2a9908395 travis only has py 3.7 for xenial 2019-03-18 16:20:54 -07:00
Noah Levitt
d729c8d0d5 use yaml.safe_load()
getting new warnings
see https://github.com/yaml/pyyaml/wiki/PyYAML-yaml.load(input)-Deprecation
2019-03-18 15:49:44 -07:00
Noah Levitt
6f5f090c33 test py 3.7 2019-03-18 15:49:03 -07:00
Noah Levitt
ef981706f4 fix rethinkdb dependency version 2019-03-18 15:08:36 -07:00
Noah Levitt
c686fc7443 Merge branch 'master' into qa
* master:
  peg to working doublethink
  Use disk cache params only on Chrome.start
  Remove stale comment
  Improve disk cache options
  Add disk cache options to Chrome
  update (C)
2019-03-14 20:06:12 +00:00
Noah Levitt
61274ae994 peg to working doublethink
see: https://github.com/internetarchive/doublethink/commit/f7fc7da725c9b
2019-03-14 20:04:09 +00:00
Noah Levitt
7d5bb4b5d4
Merge pull request #148 from vbanos/disk-cache
Add disk cache options to Chrome
2019-02-12 14:39:49 -08:00
Vangelis Banos
9c48a6fa11 Use disk cache params only on Chrome.start
Use `disk_cache_dir` and `disk_cache_size` only on `Chrome.start` and
not on `Chrome.__init__`.

Drop `disk_cache_dir` and `disk_cache_size` class attributes.
2019-02-12 20:59:08 +00:00
Vangelis Banos
adeca823dd Remove stale comment 2019-02-12 07:21:44 +00:00
Vangelis Banos
31e611771e Improve disk cache options
Remove `--disable-cache`; it's not used any more.

Rename `disk_cache` to `disk_cache_dir` and use only path (str)
argument.

Decouple `--disk-cache-size` from `--disk-cache-dir` so it is possible
to use either or both.
2019-02-07 07:42:45 +00:00
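
A minimal sketch of the decoupling described in the commit message above, assuming the two options are turned into command-line flags independently. The helper name `disk_cache_args` is hypothetical and is not taken from the repository; only the flag names `--disk-cache-dir` and `--disk-cache-size` come from the commit messages.

```python
def disk_cache_args(disk_cache_dir=None, disk_cache_size=None):
    """Build chromium flags so that either option can be used on its own."""
    args = []
    if disk_cache_dir:
        # Path (str) only, per the renamed `disk_cache_dir` option.
        args.append("--disk-cache-dir=%s" % disk_cache_dir)
    if disk_cache_size is not None:
        # Size in bytes; independent of whether a directory was given.
        args.append("--disk-cache-size=%s" % disk_cache_size)
    return args
```

For example, `disk_cache_args(disk_cache_size=10000)` would yield only the size flag, leaving the cache directory at chromium's default.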
Vangelis Banos
c288c9ae98 Add disk cache options to Chrome
Add `Chrome` options `disk_cache` and `disk_cache_size` which add chromium
options `--disk-cache-dir=<DIR>` and `--disk-cache-size=N` (bytes).
The default is to use `--disable-cache` (no disk caching).

There are two ways to use the new vars. If you just use
`Chrome(disk_cache=True)`, the chromium cli option `--disable-cache` is
NOT used and chromium writes disk cache inside profile dir.

If you use `Chrome(disk_cache='/tmp/custom_dir', disk_cache_size=10000)`
chromium will use `--disk-cache-dir=/tmp/custom_dir
--disk-cache-size=10000`.
2019-02-06 16:22:10 +00:00
Noah Levitt
809ea3885f
Merge pull request #147 from galgeek/bye_simpleclicks
no more simpleclicks/mouseovers
2019-01-14 13:48:48 -08:00
Barbara Miller
f6ffb4acea update (C) 2019-01-10 16:11:24 -08:00
Barbara Miller
46f64fb5e3 Merge branch 'bye_simpleclicks' into qa 2019-01-10 16:08:10 -08:00
Barbara Miller
9001156b54 rm simpleclicks.js.j2 mouseovers.js.j2 2019-01-10 15:58:38 -08:00
Barbara Miller
770ea6de1e no more simpleclicks/mouseovers 2019-01-10 15:54:47 -08:00
Barbara Miller
2aade83017 Merge branch 'instaInterval' into qa 2018-12-21 17:13:42 -08:00
Barbara Miller
db5df9cdd4 interval 2000 2018-12-21 17:13:22 -08:00
Barbara Miller
e1ceb87ca2
Merge pull request #146 from nlevitt/https-redirect
least surprise on http/https seed redirects
2018-12-21 15:26:04 -08:00
Noah Levitt
a74f46dc53 least surprise on http/https seed redirects
if http://foo.com/ redirects to https://foo.com/a/b/c let's also
put all of https://foo.com/ in scope
2018-12-21 15:17:31 -08:00
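
The scoping rule in the commit message above could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: `extra_scope_url` is a hypothetical helper, and it simplifies scope to "the seed URL with the redirect target's scheme".

```python
from urllib.parse import urlsplit


def extra_scope_url(seed_url, redirect_url):
    """Return the seed URL under the redirect's scheme, or None if not applicable."""
    seed = urlsplit(seed_url)
    target = urlsplit(redirect_url)
    # Only applies when the redirect flips between http and https on the same host.
    if {seed.scheme, target.scheme} == {"http", "https"} and seed.hostname == target.hostname:
        return seed._replace(scheme=target.scheme).geturl()
    return None
```

With the seed `http://foo.com/` redirecting to `https://foo.com/a/b/c`, the helper returns `https://foo.com/`, which would then be added to scope alongside the original seed.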
Barbara Miller
8b93c078b7 Merge branch 'instaInterval' into qa 2018-12-21 14:41:27 -08:00
Barbara Miller
93e5769428 instagram interval 1000 2018-12-21 14:41:04 -08:00
Noah Levitt
6b8e597a43 bump version after merge 2018-12-20 11:30:49 -08:00
Noah Levitt
0a08c01461
Merge pull request #145 from galgeek/no-skipIframes
no skipIframes for umbraBehavior
2018-12-20 11:30:28 -08:00
Barbara Miller
bf8bbfba27 Merge branch 'no-skipIframes' into qa 2018-12-20 11:25:54 -08:00
Barbara Miller
047b46bc4e back out now unnecessary updates 2018-12-20 11:25:06 -08:00
Barbara Miller
d8f97e7b3f no current need for skipIframes with new try/catch 2018-12-20 11:24:30 -08:00
Noah Levitt
034f7938c4 catch common exception in default behavior 2018-12-20 10:46:05 -08:00
Noah Levitt
2cd64811b3 bump version after merge 2018-12-17 15:10:26 -08:00
Noah Levitt
d8c9dd2ff4
Merge pull request #144 from galgeek/umbraBehavior18q4
fix instagram captures; add skipIframe feature
2018-12-17 15:09:52 -08:00
Barbara Miller
921261df39 Merge branch 'umbraBehavior18q4' into qa 2018-12-17 15:05:46 -08:00
Barbara Miller
4a0d95277f update umbraBehavior 2018-12-17 15:04:36 -08:00
Barbara Miller
cbd6f0f90a Merge branch 'insta18q4' into qa 2018-12-13 17:29:36 -08:00
Barbara Miller
425d44bf4a updates for jinja2 2018-12-13 17:27:15 -08:00
Barbara Miller
6c21a9f773 iframe option and other instagram updates 2018-12-13 15:54:10 -08:00
Noah Levitt
15870e6010 avoid IndexError
in some cases we receive this event from the browser:
{"method":"ServiceWorker.workerVersionUpdated","params":{"versions":[]}}
2018-12-13 15:49:38 -08:00
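
A minimal sketch of guarding against the empty `versions` list shown in the commit message above. The function name `first_worker_version` is hypothetical, and the shape of a non-empty entry in the usage below is made up for illustration; only the empty-list event comes from the commit message.

```python
import json


def first_worker_version(raw_message):
    """Return the first entry of params.versions, or None when the list is empty."""
    event = json.loads(raw_message)
    versions = event.get("params", {}).get("versions", [])
    # Indexing versions[0] unconditionally would raise IndexError on [].
    return versions[0] if versions else None


empty_event = '{"method":"ServiceWorker.workerVersionUpdated","params":{"versions":[]}}'
result = first_worker_version(empty_event)  # None instead of an IndexError
```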
Noah Levitt
b577fe3c36 log browser uncaught exceptions at debug level
didn't realize these weren't showing up as console messages
2018-12-13 15:45:35 -08:00
Barbara Miller
c50e9637ae Merge branch 'insta18q4' into qa 2018-12-09 14:26:38 -08:00
Barbara Miller
cb0c0f51ef iframe option and other instagram updates 2018-12-09 14:25:59 -08:00
Noah Levitt
b447063099 Merge branch 'master' into qa
* master:
  bump version after merge
  change time limit enforcement
2018-11-29 14:52:32 -08:00
Noah Levitt
ebcc063fe2 bump version after merge 2018-11-29 14:52:11 -08:00
jkafader
898756690f
Merge pull request #142 from nlevitt/service-worker
fetch service worker script with proper headers
2018-11-29 13:42:59 -08:00
jkafader
9c27e829aa
Merge pull request #136 from nlevitt/revert-time-limit
change time limit enforcement
2018-11-29 12:29:35 -08:00
Noah Levitt
983ed7bc60 Merge branch 'service-worker' into qa
* service-worker:
  fix tests
2018-11-27 16:07:35 -08:00