We initialize `Browser` either directly or via `BrowserPool`, and we don't
always call `Browser.start()`.
`headless` needs to be an init param and not a `start()` param.
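
A minimal sketch of the idea, not the actual Brozzler code: `SketchChrome` and `SketchBrowser` are hypothetical stand-ins, and the `chromium-browser` executable name is an assumption. It shows `headless` being fixed at construction time, so the setting is known even when `start()` is never called.

```python
class SketchChrome:
    """Hypothetical stand-in for the object that builds the chromium command line."""

    def __init__(self, executable="chromium-browser", headless=False):
        self.executable = executable
        self.headless = headless

    def command_line(self):
        # Only add --headless when it was requested at init time.
        args = [self.executable]
        if self.headless:
            args.append("--headless")
        return args


class SketchBrowser:
    """Hypothetical stand-in for a browser wrapper created directly or by a pool."""

    def __init__(self, headless=False):
        # headless is an init param, so it is set whether or not start() runs.
        self.chrome = SketchChrome(headless=headless)
        self.started = False

    def start(self):
        # start() takes no headless argument; it reuses the init-time setting.
        self.started = True
        return self.chrome.command_line()
```

A pool can then pass `headless` once when it constructs each browser, instead of relying on every caller of `start()` to pass it.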
Using the `--headless` chromium-browser option, we can run all of Brozzler
without requiring an X display.
https://chromium.googlesource.com/chromium/src/+/lkgr/headless/README.md
Benchmarks are ~20% faster using `--headless`.
This option was introduced in chromium-browser 59 but was unstable. I've
tested it with v73.0.3683.86 and it's running well.
With this PR, we add it as an option that is disabled by default.
which encumbers the validation with additional requirements,
specifically, it makes it difficult to validate a subclass of `dict` because
it expects a constructor that works like `dict.__init__()`.
see https://travis-ci.org/internetarchive/brozzler/jobs/514858838
(unroll "sudo cat /var/log/brozzler-worker.log")
2019-04-02 20:16:01,792 18595 CRITICAL BrozzlingThread:42073 brozzler.worker.BrozzlerWorker.brozzle_site(worker.py:412) unexpected exception
Traceback (most recent call last):
  File "/opt/brozzler-ve3/lib/python3.6/site-packages/brozzler/worker.py", line 379, in brozzle_site
    enable_youtube_dl=not self._skip_youtube_dl)
  File "/opt/brozzler-ve3/lib/python3.6/site-packages/brozzler/worker.py", line 215, in brozzle_page
    browser, site, page, on_screenshot, on_request)
  File "/opt/brozzler-ve3/lib/python3.6/site-packages/brozzler/worker.py", line 292, in _browse_page
    cookie_db=site.get('cookie_db'))
  File "/opt/brozzler-ve3/lib/python3.6/site-packages/brozzler/browser.py", line 341, in start
    self.websock_url = self.chrome.start(**kwargs)
  File "/opt/brozzler-ve3/lib/python3.6/site-packages/brozzler/chrome.py", line 200, in start
    return self._websocket_url()
  File "/opt/brozzler-ve3/lib/python3.6/site-packages/brozzler/chrome.py", line 247, in _websocket_url
    raise e
Exception: chrome process died with status 1
Use `disk_cache_dir` and `disk_cache_size` only on `Chrome.start` and
not on `Chrome.__init__`.
Drop `disk_cache_dir` and `disk_cache_size` class attributes.
Remove `--disable-cache`; it's no longer used.
Rename `disk_cache` to `disk_cache_dir` and accept only a path (str)
argument.
Decouple `--disk-cache-size` from `--disk-cache-dir` so it is possible
to use either or both.
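
A minimal sketch of the decoupled behavior, assuming a hypothetical helper named `disk_cache_args` (not a function in the Brozzler code base). Each chromium flag is emitted only when its own argument is given, so either or both can be used.

```python
def disk_cache_args(disk_cache_dir=None, disk_cache_size=None):
    """Build chromium disk cache flags; hypothetical helper for illustration."""
    args = []
    if disk_cache_dir is not None:
        # Path (str) of the directory chromium should use for its disk cache.
        args.append("--disk-cache-dir=%s" % disk_cache_dir)
    if disk_cache_size is not None:
        # Cache size in bytes; independent of whether a directory was given.
        args.append("--disk-cache-size=%s" % disk_cache_size)
    return args
```

With no arguments the helper adds no flags, which matches the removal of `--disable-cache`.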
Add `Chrome` options `disk_cache` and `disk_cache_size` which add chromium
options `--disk-cache-dir=<DIR>` and `--disk-cache-size=N` (bytes).
The default is to use `--disable-cache` (no disk caching).
There are two ways to use the new options. If you just use
`Chrome(disk_cache=True)`, the chromium CLI option `--disable-cache` is
NOT used and chromium writes its disk cache inside the profile dir.
If you use `Chrome(disk_cache='/tmp/custom_dir', disk_cache_size=10000)`,
chromium will use `--disk-cache-dir=/tmp/custom_dir
--disk-cache-size=10000`.
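
The cases described in this message can be sketched as follows. `initial_cache_args` is a hypothetical helper, not the actual `Chrome` implementation, and it assumes that `disk_cache_size` is only applied when a custom directory is given, which this message does not state explicitly.

```python
def initial_cache_args(disk_cache=None, disk_cache_size=None):
    """Map the disk_cache options to chromium flags; hypothetical helper."""
    if not disk_cache:
        # Default: no disk caching.
        return ["--disable-cache"]
    args = []
    if isinstance(disk_cache, str):
        # A path selects a custom cache directory.
        args.append("--disk-cache-dir=%s" % disk_cache)
        # Assumption: the size flag is tied to the custom directory.
        if disk_cache_size is not None:
            args.append("--disk-cache-size=%s" % disk_cache_size)
    # disk_cache=True adds no flags, so chromium caches inside the profile dir.
    return args
```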