extractor.py should ignore already extracted PDFs #38

Open
skorasaurus opened this issue Jul 6, 2019 · 0 comments

Currently, extractor.py runs on every qualifying PDF, including ones that have already been extracted.

Because we add newly released files incrementally, re-extracting the older ones on every run wastes time.
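
One possible approach, as a minimal sketch only: skip any PDF that already has an up-to-date output file. This assumes each PDF produces a single output file with the same stem in a separate output directory; the function name `pdfs_needing_extraction`, the directory arguments, and the `.txt` suffix are all hypothetical and not taken from the current extractor.py.

```python
from pathlib import Path


def pdfs_needing_extraction(pdf_dir, out_dir, out_suffix=".txt"):
    """Return the PDFs in pdf_dir that have no up-to-date output in out_dir.

    Assumption (hypothetical layout): "report.pdf" is extracted to
    "<out_dir>/report<out_suffix>".
    """
    pdf_dir, out_dir = Path(pdf_dir), Path(out_dir)
    pending = []
    for pdf in sorted(pdf_dir.glob("*.pdf")):
        out = out_dir / (pdf.stem + out_suffix)
        # Extract when the output is missing, or when the PDF was
        # modified after its output was written (e.g. a re-released file).
        if not out.exists() or out.stat().st_mtime < pdf.stat().st_mtime:
            pending.append(pdf)
    return pending
```

The extractor could then loop over the returned list instead of over every qualifying PDF. Comparing modification times is a cheap heuristic; if files can be replaced without their timestamps changing, storing a content hash per extracted PDF would be more robust.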
