Barbara Miller
9001449c70
prioritize scrolling down
2019-11-14 17:46:34 -08:00
Barbara Miller
ef70907040
Merge pull request #179 from CorentinB/fix-fb-ads-variants
...
Fix Facebook Ads Library variants selector
2019-11-13 13:08:47 -08:00
Corentin Barreau
beb80da7d2
Fix ads variant selector
2019-11-13 18:11:48 +01:00
Barbara Miller
c66d131886
Merge branch '168' into qa
2019-11-07 16:05:50 -08:00
Noah Levitt
395ff69f0a
bump version after merge
2019-11-06 13:28:45 -08:00
Noah Levitt
802fbff986
Merge pull request #178 from galgeek/ARI-5995-tidied
...
ARI-5995 instagram capture updates
2019-11-06 13:26:56 -08:00
Corentin Barreau
06fba51b7f
Restore 500ms interval speed
2019-11-06 14:11:19 +01:00
Barbara Miller
b4d9b6d20b
Merge branch 'ARI-5995-tidied' into qa
2019-11-05 17:43:32 -08:00
Barbara Miller
ac4a3f9914
simpler check, interval; 500
2019-11-05 17:23:01 -08:00
Barbara Miller
69250359bc
Merge branch 'ARI-5995-min' into qa
2019-11-05 16:26:09 -08:00
Barbara Miller
4f9b6a8fab
skip unneeded check
2019-11-05 16:25:04 -08:00
Barbara Miller
aa8010e93f
Merge branch 'ARI-5995-min' into qa
2019-11-05 16:20:55 -08:00
Barbara Miller
45ffce19ec
skip unneeded check
2019-11-05 16:20:24 -08:00
Barbara Miller
3c5f0e25ff
Merge remote-tracking branch 'upstream/master' into qa
2019-11-04 15:21:48 -08:00
Noah Levitt
754b92cb96
bump version after merge
2019-11-04 15:20:58 -08:00
Noah Levitt
5bbd262144
Merge pull request #177 from CorentinB/fb-ads-variants
...
Add capture of Facebook ads variants
2019-11-04 15:20:42 -08:00
Barbara Miller
bb20b8e621
Merge branch 'ARI-5995-min' into qa
2019-11-04 15:01:32 -08:00
Corentin Barreau
ea021ab568
Add: capture of variant ads
2019-11-03 13:35:09 +01:00
Corentin Barreau
e414658056
Add: childSelector
2019-11-03 13:21:23 +01:00
Corentin Barreau
9b54723802
Change interval speed
2019-10-31 23:05:54 +01:00
Corentin Barreau
c3e4597d1a
Revert "Change interval speed"
...
This reverts commit 473fd9e3936be5b179bf7a0b6091bef91fade0c0.
2019-10-31 23:04:50 +01:00
Corentin Barreau
473fd9e393
Change interval speed
2019-10-31 23:02:59 +01:00
Noah Levitt
a85d95e145
bump version after merge
2019-10-31 15:00:59 -07:00
Noah Levitt
3c85cb34c3
Merge pull request #175 from vbanos/whatwg-outlinks
...
Use urlcanon.whatwg in extracted outlinks
2019-10-31 15:00:31 -07:00
Vangelis Banos
33b7a7f564
Use urlcanon.whatwg in extracted outlinks
...
The aim is to improve outlink quality.
2019-10-31 21:27:55 +00:00
Barbara Miller
37e1c7ed55
rmSelector to remove() login div
2019-10-17 18:03:12 -07:00
Noah Levitt
8beb96817e
bump version after merge
2019-10-17 13:48:23 -07:00
Noah Levitt
e23fa68d65
fix bug clobbering own changes to parent_page
...
and some other tweaks (python 3.5+, pytest logging config, ...)
2019-10-17 13:47:54 -07:00
Noah Levitt
ba85917f70
Merge pull request #172 from vbanos/block-more-analytics
...
Block more google-analytics URLs
2019-10-16 10:49:48 -07:00
Barbara Miller
cd30ba8bfa
Merge branch 'ARI-5995-min' into qa
2019-10-15 16:00:29 -07:00
Barbara Miller
ddf19121fd
limit=1 not firstMatchOnly plus nextAction
2019-10-15 15:59:12 -07:00
Barbara Miller
e99e929b2c
Merge branch 'ARI-5995-min' into qa
2019-10-15 14:49:36 -07:00
Barbara Miller
59f3fcc96e
limit=1 not firstMatchOnly plus nextAction
2019-10-15 14:36:19 -07:00
Barbara Miller
66a29dc8fe
update first close selector
2019-10-15 14:35:54 -07:00
Barbara Miller
c62c9f9063
delay instagram youtube-dl captures; collapse if block
2019-10-15 14:35:33 -07:00
Barbara Miller
a17c34236c
Merge branch 'ARI-5995' into qa
2019-10-15 11:59:28 -07:00
Barbara Miller
5dd64aef0b
try youtube-dl first again
2019-10-15 11:59:13 -07:00
Barbara Miller
f0bd18a74d
limit=1 not firstMatchOnly plus nextAction
2019-10-15 11:53:57 -07:00
Barbara Miller
bb85d02f95
Merge branch 'CorentinB-facebook' into qa
2019-10-11 16:35:09 -07:00
Barbara Miller
ef3546e04b
Merge branch 'facebook' of git://github.com/CorentinB/brozzler into CorentinB-facebook
2019-10-11 16:17:25 -07:00
Vangelis Banos
f23f49108b
Block more google-analytics URLs
...
After analysing capture logs, we saw that many google-analytics related
URLs, which are used for web statistics, were not being blocked. We add
these to the blocked URLs.
In addition, we improve existing block rules. We used to block
`*google-analytics.com/analytics.js` but many sites append some kind of
parameter at the end, so these URLs weren't blocked. We add `*` at the end of
the existing rules to block these cases as well.
2019-10-11 10:45:23 +00:00
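Note on the trailing `*` in the commit above: a minimal sketch of why the old rule missed URLs with a query string. It assumes glob-style matching against the whole URL and uses Python's `fnmatch` as a stand-in; the matcher actually used by the browser is not shown in this log, and the sample URL and the `is_blocked` helper are hypothetical.

```python
from fnmatch import fnmatchcase

# Rule as it was before the commit, and the same rule with `*` appended.
OLD_RULE = "*google-analytics.com/analytics.js"
NEW_RULE = "*google-analytics.com/analytics.js*"

# Hypothetical request URL carrying a trailing parameter.
URL_WITH_PARAM = "https://www.google-analytics.com/analytics.js?v=1"
URL_PLAIN = "https://www.google-analytics.com/analytics.js"


def is_blocked(url, rules):
    """Return True if the URL matches any glob rule in full."""
    return any(fnmatchcase(url, rule) for rule in rules)


# The old rule must match to the end of the URL, so the parameter defeats it.
old_blocks_param = is_blocked(URL_WITH_PARAM, [OLD_RULE])
# The trailing `*` absorbs whatever follows `analytics.js`.
new_blocks_param = is_blocked(URL_WITH_PARAM, [NEW_RULE])
# `*` also matches the empty string, so the plain URL stays blocked.
new_blocks_plain = is_blocked(URL_PLAIN, [NEW_RULE])
```

Under these assumptions the old rule blocks only the plain URL, while the new rule blocks both forms.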
Noah Levitt
1bda52d4c9
bump version
2019-10-09 16:28:58 -07:00
Noah Levitt
65c7ccdcff
brozzle-page --screenshot-full-page option
2019-10-09 16:28:26 -07:00
Noah Levitt
e5a3ada349
Merge pull request #171 from vbanos/screenshot-full-screen
...
Add option to capture full page screenshot
2019-10-09 16:27:05 -07:00
Barbara Miller
23d32089bb
Merge branch 'ARI-5995' into qa
2019-10-09 10:57:14 -07:00
Vangelis Banos
ba901e3a99
Fix JPEG thumbnail problems
...
Because we run JS behaviors before we capture the
screenshot, the browser may be scrolled down the page. When we
don't capture the full page, we may get a screenshot of the bottom part of
the page and not the top. To fix that we run `window.scroll(0, 0)`
before capturing the screenshot.
We rename method `BrozzlerWorker.full_and_thumb_jpegs` to
`BrozzlerWorker.thumb_jpeg`, because we already get a JPEG
from the browser after our changes to `Browser.screenshot`.
`thumb_jpeg` only returns a thumbnail now. There is no need to read PNG
and convert to JPEG. This means that screenshots will be a bit faster
now :)
2019-10-09 13:34:38 +00:00
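Note on the scroll-then-capture order described in the commit above: a rough sketch of the two Chrome DevTools Protocol messages involved, built as JSON strings only. It is not brozzler's code; the `screenshot_messages` helper and the message ids are hypothetical, and the quality value of 95 is taken from the "Use JPEG quality: 95 for screenshots" commit.

```python
import json


def screenshot_messages(first_id=1, quality=95):
    """Build the messages for: scroll to top, then capture a JPEG."""
    scroll = {
        "id": first_id,
        "method": "Runtime.evaluate",
        # Undo any scrolling left behind by the JS behaviors.
        "params": {"expression": "window.scroll(0, 0)"},
    }
    capture = {
        "id": first_id + 1,
        "method": "Page.captureScreenshot",
        # Asking for JPEG directly avoids a PNG-to-JPEG conversion later.
        "params": {"format": "jpeg", "quality": quality},
    }
    # Order matters: the scroll has to be sent before the capture.
    return [json.dumps(scroll), json.dumps(capture)]


messages = screenshot_messages()
```

Sending these over a DevTools websocket and decoding the base64 `data` field of the capture response is left out of the sketch.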
Vangelis Banos
674da4aa99
Use JPEG quality: 95 for screenshots
2019-10-09 11:57:18 +00:00
Vangelis Banos
544222b021
Moved screenshot code right after run_behavior
...
There were some weird screenshots when invoking `try_screenshot` at the end
after `visit_hashtags` and `extract_outlinks`. The screenshot was
distorted.
2019-10-09 11:39:32 +00:00
Vangelis Banos
c007cda87e
Capture screenshot after running behaviors
...
This is necessary to load all images before taking the screenshot.
2019-10-09 11:05:58 +00:00
Barbara Miller
dafd9241cc
update first close selector
2019-10-08 17:32:24 -07:00