Remove "proxy" references

Lauren Ko 2020-10-06 14:55:03 -05:00
parent 18d3f5f930
commit 09fe10307d
2 changed files with 4 additions and 16 deletions

@@ -14,7 +14,6 @@ Example
 id: myjob
 time_limit: 60 # seconds
-proxy: 127.0.0.1:8000 # point at warcprox for archiving
 ignore_robots: false
 max_claimed_sites: 2
 warcprox_meta:
@@ -186,16 +185,6 @@ enforced at the seed level. If a time limit is specified at the top level, it
 is inherited by each seed as described above, and enforced individually on each
 seed.
-``proxy``
-~~~~~~~~~
-+--------+----------+---------+
-| type   | required | default |
-+========+==========+=========+
-| string | no       | *none*  |
-+--------+----------+---------+
-HTTP proxy, with the format ``host:port``. Typically configured to point to
-warcprox for archival crawling.
 ``ignore_robots``
 ~~~~~~~~~~~~~~~~~
 +---------+----------+-----------+
@@ -226,8 +215,8 @@ to contact the operator if the crawl is causing problems.
 +============+==========+===========+
 | dictionary | no       | ``false`` |
 +------------+----------+-----------+
-Specifies the ``Warcprox-Meta`` header to send with every request, if ``proxy``
-is configured. The value of the ``Warcprox-Meta`` header is a json blob. It is
+Specifies the ``Warcprox-Meta`` header to send with every request.
+The value of the ``Warcprox-Meta`` header is a json blob. It is
 used to pass settings and information to warcprox. Warcprox does not forward
 the header on to the remote site. For further explanation of this field and
 its uses see

@@ -31,16 +31,15 @@ Then you can run brozzler-new-site:
 ::
-(brozzler-ve3)vagrant@brzl:~$ brozzler-new-site --proxy=localhost:8000 http://example.com/
+(brozzler-ve3)vagrant@brzl:~$ brozzler-new-site http://example.com/
-Or brozzler-new-job (make sure to set the proxy to localhost:8000):
+Or brozzler-new-job:
 ::
 (brozzler-ve3)vagrant@brzl:~$ cat >job1.yml <<EOF
 id: job1
-proxy: localhost:8000 # point at warcprox for archiving
 seeds:
 - url: https://example.org/
 EOF