From 7e9671f411a877dbc683ca34204cff1db74f395a Mon Sep 17 00:00:00 2001
From: Ross Spencer
Date: Mon, 4 Sep 2017 01:41:30 +1200
Subject: [PATCH] Added HTTPreserve tikalinkextract. (#36)

---
 README.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/README.md b/README.md
index 79239a7..ae8e3d5 100644
--- a/README.md
+++ b/README.md
@@ -135,6 +135,8 @@ This list of tools and software is intended to briefly describe some of the most
 
 * [The Unarchiver](http://unarchiver.c3.cx/unarchiver) - Program to extract the contents of many archive formats, inclusive of WARC, to a file system. Free variant of The Archive Browser. (OSX only, Proprietary app)
 
+* [tikalinkextract](https://github.com/httpreserve/tikalinkextract) (In Development) - Extract hyperlinks as a seed for web archiving from folders of document types that can be parsed by Apache Tika. (Golang, Apache Tika Server)
+
 * [Warcat](https://github.com/chfoo/warcat) (Stable) - Tool and library for handling Web ARChive (WARC) files. (Python)
 
 * [warcio](https://github.com/webrecorder/warcio) - Streaming WARC/ARC library for fast web archive IO. (Python)