mirror of
https://github.com/sys-nyx/red-arch.git
synced 2025-05-06 08:45:31 -04:00
Update README.md
parent 78a9e155d6
commit da0624f40e
1 changed file with 12 additions and 12 deletions
24
README.md
@@ -6,26 +6,26 @@ pulls reddit data from the [pushshift](https://github.com/pushshift/api) api and
 
 requires python 3 on linux, OSX, or Windows
 
-sudo apt-get install pip
-pip install psaw
-git clone https://github.com/chid/snudown
-cd snudown
-sudo python setup.py install
-cd ..
-git clone [this repo]
-cd reddit-html-archiver
-chmod u+x *.py
+$ sudo apt-get install pip
+$ pip install psaw
+$ git clone https://github.com/chid/snudown
+$ cd snudown
+$ sudo python setup.py install
+$ cd ..
+$ git clone [this repo]
+$ cd reddit-html-archiver
+$ chmod u+x *.py
 
 Windows users may need to run
 
-chcp 65001
-set PYTHONIOENCODING=utf-8
+> chcp 65001
+> set PYTHONIOENCODING=utf-8
 
 before running `fetch_links.py` or `write_html.py` to resolve encoding errors such as 'codec can't encode character'.
 
 ### fetch reddit data
 
-data is fetched by subreddit and date range and is stored as csv files in `data`. You may need to explicitly run the script python3 if it is not the default on your system.
+data is fetched by subreddit and date range and is stored as csv files in `data`. You may need to explicitly run the script with python3 if it is not the default on your system.
 
 $ python3 ./fetch_links.py politics 2017-1-1 2017-2-1
 # or add some link/post filtering to download less data
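The `fetch_links.py` invocation in the hunk above takes a subreddit and two dates written without zero padding (`2017-1-1`). The following is a minimal sketch of how such arguments could be converted into the epoch-second bounds that a date-range query typically needs. It assumes the dates mean midnight UTC, and `parse_date_range` is a hypothetical helper written for illustration, not code taken from this repository.

```python
from datetime import datetime, timezone
from typing import Tuple


def parse_date_range(start: str, end: str) -> Tuple[int, int]:
    """Convert two YYYY-M-D strings to epoch seconds (hypothetical helper).

    Assumption: both dates are interpreted as midnight UTC.
    """
    fmt = "%Y-%m-%d"  # strptime accepts month and day without zero padding
    start_dt = datetime.strptime(start, fmt).replace(tzinfo=timezone.utc)
    end_dt = datetime.strptime(end, fmt).replace(tzinfo=timezone.utc)
    if end_dt <= start_dt:
        raise ValueError("end date must be later than start date")
    return int(start_dt.timestamp()), int(end_dt.timestamp())


if __name__ == "__main__":
    # Same date arguments as the README example command.
    after, before = parse_date_range("2017-1-1", "2017-2-1")
    print(after, before)
```

Rejecting an empty or reversed range before any request is made keeps a typo in the dates from producing a silent, empty download.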