Reddit.com Scraper

This scraper uses scrapfly.io and Python to scrape data from reddit.com.

Full tutorial: https://scrapfly.io/blog/how-to-scrape-reddit-social-data/

The scraping code is in the reddit.py file. It is fully documented and simplified for educational purposes. Example code that runs the scraper is in the run.py file.

This scraper scrapes:

Reddit subreddit pages for subreddit and post data.
Reddit post pages for post and comment data.
Reddit user profile pages for post data.
Reddit user profile pages for comment data.
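
To make the four page types above concrete, the sketch below builds the public browser URL for each one. The helper names and example values are hypothetical and are not taken from reddit.py, and the scraper itself may request different endpoints than these.

```python
from urllib.parse import quote

BASE = "https://www.reddit.com"


def subreddit_url(subreddit: str) -> str:
    """Subreddit listing page (subreddit and post data)."""
    return f"{BASE}/r/{quote(subreddit)}/"


def post_url(subreddit: str, post_id: str) -> str:
    """Single post page (post and comment data)."""
    return f"{BASE}/r/{quote(subreddit)}/comments/{quote(post_id)}/"


def user_posts_url(username: str) -> str:
    """User profile page listing the user's posts."""
    return f"{BASE}/user/{quote(username)}/submitted/"


def user_comments_url(username: str) -> str:
    """User profile page listing the user's comments."""
    return f"{BASE}/user/{quote(username)}/comments/"


# "python", "abc123" and "example_user" are placeholder values.
print(subreddit_url("python"))
print(post_url("python", "abc123"))
print(user_posts_url("example_user"))
print(user_comments_url("example_user"))
```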

For output examples see the ./results directory.
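
A quick way to get an overview of those output examples is to load each file and report its top-level shape. This is a minimal sketch that assumes the results directory contains JSON files; the file names and their structure are not specified here, and `summarize_results` is a hypothetical helper, not part of the scraper.

```python
import json
from pathlib import Path


def summarize_results(results_dir: str = "results") -> dict:
    """Map each JSON file name to a short description of its top-level value."""
    root = Path(results_dir)
    if not root.is_dir():
        return {}
    summary = {}
    for path in sorted(root.glob("*.json")):  # assumption: results are stored as .json
        with path.open(encoding="utf-8") as f:
            data = json.load(f)
        size = len(data) if isinstance(data, (list, dict)) else 1
        summary[path.name] = f"{type(data).__name__} with {size} item(s)"
    return summary


for name, description in summarize_results().items():
    print(f"{name}: {description}")
```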

Fair Use Disclaimer

Note that this code is provided free of charge as is, and Scrapfly does not provide free web scraping support or consultation. For any bugs, see the issue tracker.

Setup and Use

This Reddit.com scraper uses Python 3.10 and the scrapfly-sdk package, which handles scraping and parsing Reddit's data.

Ensure you have Python 3.10 and the Poetry package manager installed on your system.
Retrieve your Scrapfly API key from https://scrapfly.io/dashboard and set the SCRAPFLY_KEY environment variable:
```
$ export SCRAPFLY_KEY="YOUR SCRAPFLY KEY"
```

Clone the repository and install the Python environment:
```
$ git clone https://github.com/scrapfly/scrapfly-scrapers.git
$ cd scrapfly-scrapers/reddit-scraper
$ poetry install
```

Run example scrape:
```
$ poetry run python run.py
```

Run tests:
```
$ poetry install --with dev
$ poetry run pytest test.py
# or specific scraping areas
$ poetry run pytest test.py -k test_subreddit_scraping
$ poetry run pytest test.py -k test_post_scraping
$ poetry run pytest test.py -k test_user_post_scraping
$ poetry run pytest test.py -k test_user_comment_scraping
```
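
Both the example run and the tests depend on SCRAPFLY_KEY being set, so an early check can give a clearer error than a failed request. The sketch below is an assumption about how such a check might look; `load_scrapfly_key` is a hypothetical helper, and how reddit.py actually reads the key is not shown in this README.

```python
import os


def load_scrapfly_key(env=None) -> str:
    """Return the Scrapfly API key from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get("SCRAPFLY_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "SCRAPFLY_KEY is not set; export it before running run.py or the tests"
        )
    return key


# Demonstration with an explicit mapping instead of the real environment.
print(load_scrapfly_key({"SCRAPFLY_KEY": "YOUR SCRAPFLY KEY"}))
```

In a real script the returned value would typically be passed to the SDK client, for example `ScrapflyClient(key=load_scrapfly_key())`.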
