
Better error handling #2

Open
wvengen opened this issue Oct 25, 2023 · 6 comments
Labels
API (API & Web), config (Configuration), docker (Docker), enhancement (New feature or request), k8s (Kubernetes)

Comments


wvengen commented Oct 25, 2023

Only a small number of error cases is currently handled (i.e. an error API response is returned).
Add a default error handler that returns what scrapyd would return on an error,
and handle more cases with helpful error messages.
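A default error handler along these lines could wrap each API handler and emit the scrapyd-style error payload (`node_name`/`status`/`message`) for any unhandled exception. This is only a sketch: the `NODE_NAME` value, handler names, and dict-based responses are assumptions for illustration, not the project's actual code.

```python
NODE_NAME = "example-node"  # hypothetical; scrapyd reports the host's node name here

def error_body(message):
    """Build the scrapyd-style error payload."""
    return {"node_name": NODE_NAME, "status": "error", "message": message}

def with_default_error_handler(handler):
    """Wrap an API handler so any unhandled exception yields the
    scrapyd-style error payload instead of a bare HTTP 500."""
    def wrapped(*args, **kwargs):
        try:
            return handler(*args, **kwargs)
        except Exception as exc:
            return error_body(f"Unexpected error: {exc}")
    return wrapped

@with_default_error_handler
def listspiders(project):
    # Hypothetical handler: raises for an unknown project.
    if project != "myproject":
        raise KeyError(f"unknown project: {project}")
    return {"node_name": NODE_NAME, "status": "ok", "spiders": ["quotes"]}
```

With this in place, `listspiders("nope")` returns an error payload rather than propagating the exception to the web framework.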

wvengen added the enhancement label on Dec 6, 2023

wvengen commented Jan 29, 2024

Things to look at (not an exhaustive list):

  • missing query / form parameters in an API request (some are handled, some are not) (done)
  • invalid query / form parameters in an API request (e.g. unknown project, unknown spider, unknown version) (done)
  • missing config file entries (e.g. project repository)
  • project repository that does not exist (done)
  • project repository that cannot be contacted (for a remote repository)
  • Kubernetes connection error (and other errors regarding connecting to the cluster)
  • Docker connection error (and other errors regarding connecting to the Docker daemon)
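The first item, checking for missing query / form parameters, could look roughly like this; the function and message wording are hypothetical, meant only to show the shape of the check.

```python
def require_params(data, required):
    """Return an error message naming any missing or empty parameters,
    or None when all required parameters are present.
    Sketch of the 'missing query/form parameters' check; names hypothetical."""
    missing = [name for name in required if name not in data or data[name] == ""]
    if missing:
        return "Missing parameter%s: %s" % (
            "s" if len(missing) > 1 else "",
            ", ".join(missing),
        )
    return None
```

For example, `require_params({"project": "p"}, ["project", "spider"])` would report that `spider` is missing, and the handler could feed that message into the standard error response.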


wvengen commented Jan 29, 2024

Here is the scrapyd API code, which contains several error responses:
https://github.com/scrapy/scrapyd/blob/master/scrapyd/webservice.py

wvengen added the k8s, docker, API, and config labels on Jan 31, 2024

wvengen commented Jan 31, 2024

Note that if you're working on this with a team, it can make sense to split this issue up into multiple smaller ones.


wvengen commented Feb 9, 2024

Did some tests on the original scrapyd, but its error handling is not very well worked out:

  • when requesting a non-existent endpoint, a 404 status with a generic HTML error page is returned
  • when an error message is returned, the status is still 200
  • listspiders on a non-existent project returns status 200 with content
    {"node_name": "x", "status": "error", "message": "Scrapy 2.9.0 - no active project\n\nUnknown command: list\n\nUse \"scrapy\" to see available commands\n"}

I think we can follow that.
We don't need to use the exact same error messages as scrapyd (they can be quite cryptic).
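One consequence of following scrapyd's behavior is that errors are signaled in the JSON body while the HTTP status stays 200, so a client has to inspect the payload. A minimal client-side check, assuming the response shapes described above, might look like:

```python
import json

def is_error(http_status, body):
    """scrapyd signals errors via 'status': 'error' in the JSON body while
    still returning HTTP 200, so clients must inspect the payload.
    Non-JSON bodies (e.g. the generic HTML 404 page) also count as errors."""
    if http_status != 200:
        return True
    try:
        return json.loads(body).get("status") == "error"
    except ValueError:
        return True  # not JSON, e.g. an HTML error page
```

This is a sketch of what a compatible client would do, not code from scrapyd or this project.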


wvengen commented Feb 9, 2024

Errors related to invalid query and form parameters have been (mostly?) fixed.
You can still schedule a project with a non-existent version, which means the job will be stuck in the pending state.
You can still schedule a project with a non-existent spider, which means it will error immediately and be marked finished.
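Both remaining cases could be caught with a pre-flight check at schedule time, so the request fails fast with a clear message instead of producing a stuck or immediately-failing job. The data shapes (dicts of spiders and versions per project) are hypothetical:

```python
def validate_schedule(project_spiders, project_versions, project, spider, version=None):
    """Return an error message if the spider or version does not exist,
    or None when the schedule request looks valid.
    Sketch only; real code would query the repository / cluster state."""
    if spider not in project_spiders.get(project, []):
        return f"spider '{spider}' not found in project '{project}'"
    if version is not None and version not in project_versions.get(project, []):
        return f"version '{version}' not found for project '{project}'"
    return None
```

The trade-off is an extra lookup per schedule request; for a remote repository that lookup may be slow, which is probably why scrapyd itself doesn't do it.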


wvengen commented Feb 9, 2024

A next step could be to handle exceptions and return a JSON-based error message instead of an HTTP 500 server error (this may break clients).
Ideally, this only happens for the API endpoints.

The main config-related error (where the daemon doesn't crash before starting) is when a project repository is missing from the configuration. That would be something to improve, either on load or when the repository is being used.
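The "check on load" option could be a startup pass over the configuration that reports every project missing a repository entry at once. The section naming scheme (`project.<name>`) and option name `repository` are assumptions for illustration, not necessarily the project's actual config schema:

```python
from configparser import ConfigParser

def check_repositories(config, projects):
    """On startup, verify each project section has a 'repository' entry,
    returning a list of clear error messages instead of failing later.
    Section/option names are illustrative only."""
    problems = []
    for project in projects:
        section = f"project.{project}"
        if not config.has_section(section) or not config.get(section, "repository", fallback=None):
            problems.append(f"project '{project}': missing repository in configuration")
    return problems
```

The daemon could log these problems (or refuse to start) rather than surfacing the missing entry as an obscure runtime error on first use.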
