mirror of
https://github.com/iipc/awesome-web-archiving.git
synced 2024-10-01 03:15:45 -04:00
Add HadoopConcatGz (#9)
This commit is contained in:
parent
f8b58e3624
commit
e4bf900190
@ -89,6 +89,8 @@ To the extent possible under law, the owner has waived all copyright and related
#### Utilities
* [HadoopConcatGz](https://github.com/helgeho/HadoopConcatGz) (Stable) - A Splittable Hadoop InputFormat for Concatenated GZIP Files (and `*.warc.gz`)
* [Jwat](https://sbforge.org/display/JWAT/JWAT) (Stable) - Libraries and tools for reading/writing/validating WARC/ARC/GZIP files.
* [Warcat](https://github.com/chfoo/warcat) (Stable) - Tool and library for handling Web ARChive (WARC) files.