mirror of https://github.com/sys-nyx/red-arch.git (synced 2025-05-06 08:45:31 -04:00)
Update README.md
parent 78a9e155d6
commit da0624f40e
1 changed file with 12 additions and 12 deletions
README.md (24 lines changed)
@@ -6,26 +6,26 @@ pulls reddit data from the [pushshift](https://github.com/pushshift/api) api and
 requires python 3 on linux, OSX, or Windows

-sudo apt-get install pip
-pip install psaw
-git clone https://github.com/chid/snudown
-cd snudown
-sudo python setup.py install
-cd ..
-git clone [this repo]
-cd reddit-html-archiver
-chmod u+x *.py
+$ sudo apt-get install pip
+$ pip install psaw
+$ git clone https://github.com/chid/snudown
+$ cd snudown
+$ sudo python setup.py install
+$ cd ..
+$ git clone [this repo]
+$ cd reddit-html-archiver
+$ chmod u+x *.py

 Windows users may need to run

-chcp 65001
-set PYTHONIOENCODING=utf-8
+> chcp 65001
+> set PYTHONIOENCODING=utf-8

 before running `fetch_links.py` or `write_html.py` to resolve encoding errors such as 'codec can't encode character'.

 ### fetch reddit data

-data is fetched by subreddit and date range and is stored as csv files in `data`. You may need to explicitly run the script python3 if it is not the default on your system.
+data is fetched by subreddit and date range and is stored as csv files in `data`. You may need to explicitly run the script with python3 if it is not the default on your system.

 $ python3 ./fetch_links.py politics 2017-1-1 2017-2-1
 # or add some link/post filtering to download less data

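Note on the Windows encoding step in this README: the 'codec can't encode character' error typically appears when Python writes text to a console or file whose encoding cannot represent a given character. The sketch below reproduces that class of error in isolation. It is not code from this repo; the sample string, the helper name `try_encode`, and the choice of `cp437` as a stand-in for a legacy console code page are all assumptions for illustration (the actual code page varies by system).

```python
# Illustrative sketch only, not taken from fetch_links.py or write_html.py.
# "cp437" stands in for a legacy Windows console code page (assumption).

sample = "snowman: \u2603"  # hypothetical text containing a non-ASCII character


def try_encode(text, encoding):
    """Return (ok, message): whether text can be encoded, plus any error text."""
    try:
        text.encode(encoding)
        return True, ""
    except UnicodeEncodeError as exc:
        return False, str(exc)


legacy_ok, legacy_msg = try_encode(sample, "cp437")
utf8_ok, utf8_msg = try_encode(sample, "utf-8")

print("cp437:", legacy_ok, legacy_msg)
print("utf-8:", utf8_ok, utf8_msg)
```

Setting `PYTHONIOENCODING=utf-8` asks Python to use UTF-8 for its standard streams, and `chcp 65001` switches the console to the UTF-8 code page, which corresponds to the second, successful case here.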
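Note on the date-range arguments: the example invocation passes dates such as `2017-1-1` and `2017-2-1`. A minimal sketch of how such arguments could be turned into epoch timestamps for an API query follows. This is an assumption about the general approach, not the repo's actual implementation; the helper name `to_epoch` and the choice to interpret dates as UTC midnight are hypothetical.

```python
# Illustrative sketch only; fetch_links.py may parse its arguments differently.
from datetime import datetime, timezone


def to_epoch(date_str):
    """Parse a YYYY-M-D string as UTC midnight and return epoch seconds."""
    parsed = datetime.strptime(date_str, "%Y-%m-%d")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


start = to_epoch("2017-1-1")
end = to_epoch("2017-2-1")

print(start, end)
```

Treating the dates as UTC keeps the result independent of the local timezone of the machine running the script.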