mirror of
https://github.com/iipc/awesome-web-archiving.git
synced 2024-10-01 03:15:45 -04:00
added html2warc (#16)
Added html2warc, a simple script to convert offline data into a single WARC file.
parent 4e413e2342
commit c370e303dc
@@ -54,6 +54,8 @@ To the extent possible under law, the owner has waived all copyright and related
* [Heritrix](https://webarchive.jira.com/wiki/display/Heritrix/Heritrix) (Stable) - An open source, extensible, web-scale, archival quality web crawler.
* [html2warc](https://github.com/steffenfritz/html2warc) (Stable) - A simple script to convert offline data into a single WARC file.
* [grab-site](https://github.com/ludios/grab-site) (Stable) - The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns.
* [HTTrack](http://www.httrack.com/) (Stable) - An open source website copying utility.