Barbara Miller
faf176793d
handle more/better browsing timeouts
2019-10-04 17:40:01 -07:00
Barbara Miller
4fc7b612a5
Merge branch 'ARI-5995-instagram' into qa
2019-10-04 17:29:57 -07:00
Barbara Miller
be40a0b56f
try only closeSelector
2019-10-04 17:29:39 -07:00
Barbara Miller
3226dfef5c
Merge branch 'ARI-5995-instagram' of github.com:galgeek/brozzler into ARI-5995-instagram
2019-10-04 15:22:38 -07:00
Barbara Miller
a4835f5de7
Merge branch 'ARI-5995-instagram' into qa
2019-10-01 16:36:25 -07:00
Barbara Miller
84b99aec33
open/close, then click through?
2019-10-01 16:35:53 -07:00
Barbara Miller
d6af8d3145
skip downloading videos from instagram user page
2019-10-01 16:08:53 -07:00
Barbara Miller
bc2d4903e8
update copyright
2019-10-01 16:08:53 -07:00
Barbara Miller
dd2921af69
Merge remote-tracking branch 'upstream/master' into qa
2019-10-01 16:06:25 -07:00
Barbara Miller
bbd2bd71bf
handle timeout trying to extract tertiary assets
2019-10-01 15:43:07 -07:00
Noah Levitt
85e6027838
bump version after merge
2019-09-27 10:40:59 -07:00
Noah Levitt
996070b35c
Merge pull request #167 from vbanos/console-debug-only
...
Enable Console and Runtime outputs only when debugging
2019-09-27 10:40:17 -07:00
Vangelis Banos
fed5e6b741
Enable Console and Runtime outputs only when debugging
...
When capturing a page, we receive a LOT of messages from chrome.
Examining these messages, we see that we can reduce them a bit to speed
up Brozzler.
We always use `Console.enable` which returns all browser console output.
Also, we always use `Runtime.enable`. Doc says:
https://chromedevtools.github.io/devtools-protocol/1-3/Runtime#method-enable
Enables reporting of execution contexts creation by means of
executionContextCreated event. When the reporting gets enabled the event
will be sent immediately for each existing execution context.
These outputs are useful when debugging but not in production.
If we disable them, we reduce the websocket traffic and improve
performance. With this PR, we enable them only when the current logging
level is `DEBUG`.
Counting the number of messages before and after the change, we see
improvements like:
https://www.gnome.org/technologies/ 220 -> 202 messages.
https://www.whitehouse.gov/issues/budget-spending/ 203 -> 189 messages
2019-09-27 13:24:06 +00:00
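A minimal sketch of the idea in fed5e6b741, not the actual brozzler code: choose which DevTools domains to enable based on the logger's level. `Console.enable` and `Runtime.enable` come from the commit message. The function name `devtools_enable_methods` and the always-on `Network.enable` / `Page.enable` entries are assumptions made for illustration.

```python
import logging


def devtools_enable_methods(logger):
    """Return the DevTools "enable" methods to send when a page is opened.

    Hypothetical helper: the always-on list below is an assumption,
    not taken from the commit.
    """
    methods = ["Network.enable", "Page.enable"]
    # Console and Runtime events are only useful for debugging, so request
    # them only when DEBUG logging is active. This cuts websocket traffic.
    if logger.isEnabledFor(logging.DEBUG):
        methods += ["Console.enable", "Runtime.enable"]
    return methods


if __name__ == "__main__":
    log = logging.getLogger("devtools-sketch")
    log.setLevel(logging.INFO)
    print(devtools_enable_methods(log))
    log.setLevel(logging.DEBUG)
    print(devtools_enable_methods(log))
```

Each returned method name would then be sent to the browser as a DevTools protocol message. Sending is left out because it depends on the websocket client in use.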
Noah Levitt
7273c7c3a2
Merge pull request #166 from CorentinB/facebook-ads-lib
...
Add support for Facebook ads library and fix closing
2019-09-26 14:13:47 -07:00
Barbara Miller
2c29dc0333
make instagram use default interval, like prod
2019-09-26 13:28:26 -07:00
Corentin Barreau
e701e3f101
Add: break after closing the first visible element
2019-09-26 21:44:25 +02:00
Barbara Miller
ce7e7447b7
reset instagram interval closer to default
2019-09-26 12:44:20 -07:00
Barbara Miller
f4c57f5d30
Merge branch 'ARI-5980' into qa
2019-09-25 11:36:42 -07:00
Barbara Miller
4a5bc51a72
still better regex; rm old code already
2019-09-25 11:36:15 -07:00
Corentin Barreau
101f7f2e4a
Remove: useless comment
2019-09-25 19:48:38 +02:00
Corentin Barreau
fb30fb9aa3
Add: isVisible check for close selectors
...
Modify: doTarget - Revert to initial code
2019-09-25 16:19:41 +02:00
Corentin Barreau
5c5743ea11
Fix: closeSelector not being clicked
...
Add: support for facebook.com/ads/library - Open and close metrics for ads
2019-09-25 16:10:59 +02:00
Barbara Miller
53671da941
Merge branch 'ARI-5980' into qa
2019-09-24 16:41:11 -07:00
Barbara Miller
ac9950a1ea
better regex, outlinks.push(m[2])
2019-09-24 16:40:45 -07:00
Noah Levitt
efa185a8dc
Merge pull request #160 from vbanos/behavior-timeout
...
More accurate JS behavior timeout
2019-09-24 12:11:37 -07:00
Noah Levitt
eb30ba0c33
Merge pull request #165 from vbanos/stderr-stdout-exception-handling
...
Improve exception handling when reading STDIN/STDERR
2019-09-24 12:03:06 -07:00
Barbara Miller
1603229315
Merge branch 'ARI-5995-instagram' into qa
2019-09-19 15:21:43 -07:00
Barbara Miller
9054daf3c4
skip downloading videos from instagram user page
2019-09-19 15:20:14 -07:00
Barbara Miller
f0a17da851
Merge branch 'fb-ad' into qa
2019-09-19 15:01:25 -07:00
Barbara Miller
3799f2747c
Merge branch 'ARI-5995-instagram' into qa
2019-09-19 14:58:02 -07:00
Barbara Miller
4a0ce9da04
skip downloading videos from instagram user page
2019-09-19 14:57:17 -07:00
Barbara Miller
c46f29eaae
update copyright
2019-09-19 14:41:55 -07:00
Vangelis Banos
f42ff08da1
Improve exception handling when reading STDIN/STDERR
...
When the chrome process dies and we try to read STDIN/STDERR, we get
`ValueError: I/O operation on closed file` or
`OSError: [Errno 9] Bad file descriptor`.
We modify the `readline_nonblock` method to return whatever it has read into
its buffer up to that point.
2019-09-19 20:08:55 +00:00
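A rough sketch of the behavior described in f42ff08da1, assuming a POSIX system and an unbuffered binary file object. Only the name `readline_nonblock` and the two exception types come from the commit message. The body is illustrative and may differ from the real brozzler method.

```python
import os
import select


def readline_nonblock(f):
    """Read up to one line from f without blocking.

    If the file was closed because the child process died, return the
    bytes read so far instead of raising.
    """
    buf = b""
    try:
        # A timeout of 0 makes select() a poll: loop only while data is ready.
        while select.select([f], [], [], 0)[0]:
            c = f.read(1)
            if not c:  # EOF
                break
            buf += c
            if c == b"\n":
                break
    except (ValueError, OSError):
        # ValueError: I/O operation on closed file
        # OSError: [Errno 9] Bad file descriptor
        pass
    return buf


if __name__ == "__main__":
    r, w = os.pipe()
    reader = os.fdopen(r, "rb", buffering=0)
    os.write(w, b"first line\npartial")
    print(readline_nonblock(reader))
    print(readline_nonblock(reader))
    reader.close()
    print(readline_nonblock(reader))  # closed file: empty buffer, no exception
    os.close(w)
```

Returning the partial buffer lets the caller log whatever the browser printed before it exited, instead of losing it to an exception.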
Barbara Miller
5ff7536c60
support fb ads pages?
2019-09-16 12:49:41 -07:00
Barbara Miller
4ba6efd9c9
Merge branch 'ARI-5980' into qa
2019-09-10 17:57:47 -07:00
Barbara Miller
2ec284c88b
add selector video to default
2019-09-10 17:56:20 -07:00
Barbara Miller
7bb52faca9
add pop urls using regex for better match
2019-09-10 17:49:37 -07:00
Barbara Miller
0755210b47
Merge branch 'senate-videos' into qa
2019-09-03 14:48:12 -07:00
Barbara Miller
6431f4e803
add pop urls using regex for better match
2019-09-03 14:47:48 -07:00
Barbara Miller
57a5814884
Merge branch 'senate-videos' into qa
2019-08-22 16:28:03 -07:00
Barbara Miller
5b393837b8
add selector video to default
2019-08-22 16:26:24 -07:00
Vangelis Banos
0b28a4a57f
More accurate JS behavior timeout
...
If you use a JS behavior timeout smaller than 7 sec, the JS behavior
will always need 7 sec because `sleep(7)` is hard-coded there.
We make a minor addition to use `min(timeout, 7)` for sleep so it will
finish faster when using a smaller JS behavior timeout.
2019-08-22 21:15:44 +00:00
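The change in 0b28a4a57f can be illustrated with a small polling loop. This is a sketch under assumptions, not the brozzler implementation: `wait_for_behavior`, `is_finished`, and the injectable `sleep` / `clock` parameters are hypothetical names. Only the `min(timeout, 7)` expression comes from the commit message.

```python
import time


def wait_for_behavior(timeout, is_finished, sleep=time.sleep, clock=time.monotonic):
    """Poll is_finished() until it returns True or timeout seconds pass.

    Returns True if the behavior finished and False if it timed out.
    """
    start = clock()
    while True:
        # Before the change this was sleep(7), so even a 3-second timeout
        # took at least 7 seconds. Capping the nap at the timeout fixes that.
        sleep(min(timeout, 7))
        if is_finished():
            return True
        if clock() - start >= timeout:
            return False


if __name__ == "__main__":
    # With a 1-second timeout the loop naps for 1 second, not 7.
    print(wait_for_behavior(1, lambda: False))
```

The `sleep` and `clock` parameters exist only so the loop can be run with a fake clock. A real caller would rely on the defaults.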
Barbara Miller
5304ba4491
Merge branch 'senate-videos' into qa
2019-08-21 15:11:04 -07:00
Barbara Miller
14e3d56cd2
add popup urls as outlinks
2019-08-20 15:13:35 -07:00
Noah Levitt
16f886259d
Merge pull request #158 from galgeek/aitfive-1668-soundcoud
...
capture soundcloud user page before capturing tracks
2019-08-15 15:46:55 -07:00
Barbara Miller
c6308fe754
Revert "initial commit"
...
This reverts commit 5368a840665dcf9770ede6006d685ff113c84a3f.
2019-08-08 14:03:30 -07:00
Noah Levitt
94cd6cacb6
bump version after merge
2019-07-18 11:07:27 -07:00
Noah Levitt
726c6effed
Merge pull request #157 from vbanos/block-amp-analytics
...
Block AMP analytics JS script
2019-07-18 11:07:09 -07:00
Barbara Miller
d0c46db746
Merge branch 'aitfive-1668-soundcoud' into qa
2019-07-17 17:45:39 -07:00
Barbara Miller
9cc60449d7
skip downloading tracks from soundcloud user page
2019-07-17 17:45:02 -07:00