
umbra

Browser automation via the Chrome debugging protocol.

Install

Install via pip from this repo.

Run

"umbra" script should be in bin/. load_url.py takes urls as arguments and puts them onto a rabbitmq queue dump_queue.py prints resources discovered by the browser and sent over the return queue.

On Ubuntu, installing RabbitMQ with sudo apt-get install rabbitmq-server should automatically set it up so that these three scripts work on localhost (the default AMQP URL).
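
Before running the scripts, it can help to confirm that the broker is actually listening. The following is a minimal sketch using only the Python standard library. The URL shown is an assumption based on RabbitMQ's stock defaults (guest account, port 5672), and the helper name broker_reachable is made up for this example. It only checks that a TCP connection can be opened, not that AMQP authentication succeeds.

```python
import socket
from urllib.parse import urlsplit

# Assumption: stock RabbitMQ defaults on localhost.
DEFAULT_AMQP_URL = "amqp://guest:guest@localhost:5672/%2f"


def broker_reachable(amqp_url=DEFAULT_AMQP_URL, timeout=2.0):
    """Return True if a TCP connection to the broker's host and port succeeds."""
    parts = urlsplit(amqp_url)
    host = parts.hostname or "localhost"
    port = parts.port or 5672  # 5672 is the standard AMQP port
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If broker_reachable() returns False on a fresh Ubuntu install, checking that the rabbitmq-server service is running is a reasonable first step.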
