Can't include scrapy-playwright to my PyInstaller binary #129

Closed
bernardodalfovo opened this issue Oct 7, 2022 · 5 comments

Comments

@bernardodalfovo

bernardodalfovo commented Oct 7, 2022

Hello,

I am trying to include scrapy-playwright in my binary using PyInstaller.

I have tried a few different setups:

  • When installing scrapy-playwright:
python3 -m venv .venv
source .venv/bin/activate
pip install scrapy-playwright scrapy pyinstaller
PLAYWRIGHT_BROWSERS_PATH=0 playwright install chromium
  • When creating the binaries:
import PyInstaller.__main__

PyInstaller.__main__.run(
    [
        "crawl.py",
        "-F",
        "--paths=.venv/lib/python3.8/site-packages",
        "--paths=.",
        # Bundle the Playwright driver directory alongside the code.
        "--add-data=.venv/lib/python3.8/site-packages/playwright/driver:playwright/driver",
        "--hidden-import=scrapy_playwright",
        "--hidden-import=scrapy-playwright",
        "--hidden-import=playwright",
        "--clean",
        # `hash` is presumably a key string defined elsewhere (not shown here).
        f"--key={hash}",
    ]
)

But all I get when trying to execute the crawler is:

  • On startup:
2022-10-07 20:19:11 [scrapy.core.downloader.handlers] ERROR: Loading "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler" for scheme "http"
Traceback (most recent call last):
  File "scrapy/core/downloader/handlers/__init__.py", line 49, in _load_handler
  File "scrapy/utils/misc.py", line 61, in load_object
  File "importlib/__init__.py", line 127, in import_module
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 973, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'scrapy_playwright.handler'
  • And later on:
2022-10-07 20:19:12 [scrapy.core.scraper] ERROR: Error downloading <GET https://XXXXXXXXXX.com>
Traceback (most recent call last):
  File "twisted/internet/defer.py", line 1692, in _inlineCallbacks
  File "twisted/python/failure.py", line 518, in throwExceptionIntoGenerator
  File "scrapy/core/downloader/middleware.py", line 49, in process_request
  File "scrapy/utils/defer.py", line 67, in mustbe_deferred
  File "scrapy/core/downloader/handlers/__init__.py", line 74, in download_request
scrapy.exceptions.NotSupported: Unsupported URL scheme 'https': No module named 'scrapy_playwright.handler'

(Obfuscated the URL myself)
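
For context, the first traceback goes through `load_object` and `importlib.import_module`, which suggests the handler is imported at runtime from a dotted string rather than through an `import` statement. The following is a simplified sketch of that pattern, not Scrapy's actual implementation, and it uses a standard-library class as a stand-in for the handler path. An import done this way is generally invisible to static analysis, which may be why the `scrapy_playwright.handler` submodule was left out of the binary.

```python
import importlib


def load_object(path):
    """Import a module from a dotted string and return the named attribute.

    Simplified stand-in for the helper named in the traceback; the real
    function in Scrapy does more validation than this.
    """
    module_name, _, attr_name = path.rpartition(".")
    # The module name only exists as a string at runtime, so a tool that
    # scans source files for `import` statements cannot discover it.
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# Standard-library stand-in for a handler path read from settings.
decoder_cls = load_object("json.decoder.JSONDecoder")
print(decoder_cls.__name__)
```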

@bernardodalfovo
Author

I noticed that @samwillis mentions PyInstaller in a comment on his issue #62, so it seems to be possible, at the very least.

@EvanZhou666

I have the same problem. Has it been solved?

@elacuesta
Member

Please include a full example. I'm not a PyInstaller user myself, so I don't know how to set this up.

@msimoni18

@bernardodalfovo In my experience, running into ModuleNotFoundError is an indication that you need to add that module as a hidden import, either through the command line or through the spec file. PyInstaller can't always find the packages it needs to collect, so hidden imports are a way of adding them explicitly.
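
As a rough sketch of that suggestion, the missing submodule from the traceback (`scrapy_playwright.handler`) could be named explicitly. The helper `build_pyinstaller_args` below is hypothetical and only assembles the argument list; `--hidden-import` and `--collect-submodules` are existing PyInstaller options, but nobody in this thread has confirmed that they resolve this particular error, so treat it as something to try rather than a verified fix.

```python
def build_pyinstaller_args(script, hidden_imports=(), collect_submodules=()):
    """Assemble a PyInstaller argument list (hypothetical helper)."""
    args = [script, "--onefile"]
    for module in hidden_imports:
        # Name one module that static analysis did not pick up.
        args.append(f"--hidden-import={module}")
    for package in collect_submodules:
        # Ask PyInstaller to pull in every submodule of the package.
        args.append(f"--collect-submodules={package}")
    return args


args = build_pyinstaller_args(
    "crawl.py",
    hidden_imports=["scrapy_playwright.handler"],
    collect_submodules=["scrapy_playwright"],
)
print(args)

# To build, pass the list to PyInstaller:
#   import PyInstaller.__main__
#   PyInstaller.__main__.run(args)
```

The same module names can also go into the `hiddenimports` list of a spec file instead of the command line.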

elacuesta added the Stale label on Jun 4, 2024
@elacuesta
Member

Closing due to inactivity.

elacuesta closed this as not planned on Jul 4, 2024