Commit df90e1a

Add static spider (mostly for testing)
1 parent 423aea2 commit df90e1a

2 files changed: +24 -1 lines changed

README.md (+11 -1)

@@ -12,7 +12,15 @@ so that one can get started easily.
 
 ## Scraped site
 
-This spider returns quotes from [quotes.toscrape.com](https://quotes.toscrape.com).
+This project contains two spiders. The `quotes` spider returns quotes from
+[quotes.toscrape.com](https://quotes.toscrape.com).
+
+The `static` spider returns a single dummy quote without accessing the network.
+This can be used for testing. There are several settings and environment variables
+that modify its behaviour:
+- spider setting `STATIC_TEXT` - quote text (default _To be, or not to be_)
+- spider setting `STATIC_AUTHOR` - quote author (default _Shakespeare_)
+- environment variable `STATIC_TAGS` - quote tags (default _static_)
 
 ## Running locally
 
@@ -33,6 +41,7 @@ $ scrapy list
 ```
 > ```
 > quotes
+> static
 > ```
 ```sh
 $ scrapy crawl quotes
@@ -67,6 +76,7 @@ docker run --rm example scrapy list
 ```
 > ```
 > quotes
+> static
 > ```
 
 ```sh

example/spiders/static_spider.py (+13)

@@ -0,0 +1,13 @@
+import os
+from scrapy import Spider
+
+class StaticSpider(Spider):
+    name = "static"
+    start_urls = ["file:///dev/null"]
+
+    def parse(self, response):
+        yield {
+            'text': self.settings.get('STATIC_TEXT', 'To be or not to be'),
+            'author': self.settings.get('STATIC_AUTHOR', 'Shakespeare'),
+            'tags': os.getenv('STATIC_TAGS', 'static')
+        }
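
The lookup order in `parse` can be tried without installing Scrapy. The sketch below is not part of the commit: `build_static_item` is a hypothetical helper, and a plain mapping stands in for the spider's settings object on the assumption that only its `get(key, default)` behaviour matters here. It shows that the text and author come from settings, while the tags come from the environment and stay a single string.

```python
import os
from typing import Mapping, Optional


def build_static_item(
    settings: Mapping[str, str],
    environ: Optional[Mapping[str, str]] = None,
) -> dict:
    """Mirror the lookups in StaticSpider.parse using plain mappings.

    `settings` is a stand-in for Scrapy's settings object (assumption:
    only `get(key, default)` is needed). `environ` defaults to the real
    process environment, as `os.getenv` would use.
    """
    env = os.environ if environ is None else environ
    return {
        "text": settings.get("STATIC_TEXT", "To be or not to be"),
        "author": settings.get("STATIC_AUTHOR", "Shakespeare"),
        # Read from the environment, not from settings; the value is kept
        # as one string and is not split into a list of tags.
        "tags": env.get("STATIC_TAGS", "static"),
    }


if __name__ == "__main__":
    # No overrides: every field falls back to its default.
    print(build_static_item({}, environ={}))
    # Override one setting and the environment variable.
    print(build_static_item({"STATIC_AUTHOR": "Hamlet"},
                            environ={"STATIC_TAGS": "test,demo"}))
```

With the real spider, the two settings can be overridden per run through Scrapy's `-s NAME=VALUE` command-line option, and `STATIC_TAGS` through the shell environment of the `scrapy crawl static` process.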
