-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BookWorm: allow imports via BookWorm again #10313
Merged
scottbarnes
merged 11 commits into
internetarchive:master
from
scottbarnes:add-bookworm-proxy-support-redux
Jan 21, 2025
Merged
BookWorm: allow imports via BookWorm again #10313
scottbarnes
merged 11 commits into
internetarchive:master
from
scottbarnes:add-bookworm-proxy-support-redux
Jan 21, 2025
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Product Notes: Later, another thing we could do is see if a contributor can:
|
e1cfcf0
to
2fdb6f1
Compare
5012940
to
c07c819
Compare
36eae72
to
ad75ed4
Compare
This commit reenables `/isbn` on both BookWorm and by extension openlibrary.org. Note: this changes the signature of `vendors.AmazonAPI()` to add a `proxy_url` parameter for passing in an HTTP Proxy URL.
Move setup_requests to a place less likely to cause circular imports.
`add_book/__init__()` calls `add_cover()`, which ultimately uses `requests` to fetch covers from, e.g., Amazon. This commit adds support for using the proxy during the `add_book` process.
create_edition_from_amazon_metadata does not appear to be imported for its side effects.
This simply moves some existing code to another section and is meant to make code review easier, so moving code and substantive changes aren't both in the same commit.
Because of the HTTP Proxy requirements, only known cover hosts are used when importing via `load()` (e.g. from `/isbn` or via `/api/import`). `'cover'` values not found in `ALLOWED_COVER_HOSTS` are simply ignored during import to prevent timeouts when trying to connect to hosts not added to the proxy.
Allow cover fetching from books.google.com
This will need an update to `olsystem` to `NO_PROXY` `.openlibrary.org`.
ad75ed4
to
3c84e06
Compare
mekarpeles
approved these changes
Jan 15, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm pending adding to coverstore
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Closes #10230, #10316
Related: internetarchive/olsystem#253
This commit reenables
/isbn
on BookWorm and by extension openlibrary.org.Note: this changes the signature of
vendors.AmazonAPI()
to add aproxy_url
parameter for passing in an HTTP Proxy URL.It will be passed by default when loaded by BookWorm, but if using
AmazonAPI()
in the REPL onol-home0
, thenproxy_url
will need to be passed when instantiatingAmazonAPI()
.I checked 1000
staged
items from Amazon and every one had the cover hosted atm.media-amazon.com
. I had fewer Google Books items to check, but they seem to all have covers coming frombooks.google.com
.Testing
The BookWorm/
/isbn
part of this was tested by visiting https://openlibrary.org/isbn/6610742898 with the relevant part of the code patch deployed to BookWorm.That resulted in importing https://openlibrary.org/books/OL57448379M/Environmental_Enrichment_for_Captive_Animals, as the Amazon record has no
'cover'
field.The HTTP Proxy having been updated, the following was imported (with cover) from
/isbn/<isbn>
: https://testing.openlibrary.org/books/OL57494524M/Postcolonial_Hauntings.Possibly follow up: numerous ISBNs will have failed import over the past few months, when tried via
/isbn
. They will not now import as they've been marked as failed. It may be desirable to fix this in the database. A staff member will need to do it.Screenshot
Stakeholders
@mekarpeles