Modify Response object #328

Closed
Ehsan-U opened this issue Nov 13, 2024 · 1 comment

Ehsan-U commented Nov 13, 2024

Thank you for creating such an amazing package! My goal is to retrieve only the intercepted json_data, without the full HTML page content. Is there a way to set the intercepted json_data as the body of the Response object that is received in the parse callback?

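    import scrapy
    from playwright.async_api import Route
    from scrapy.http import Response
    from scrapy_playwright.page import PageMethod

    # The methods below belong to a scrapy.Spider subclass.
    def start_requests(self):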
        url = "https://littlecaesars.com/en-us/order/pickup/stores/search/75215/"
        yield scrapy.Request(url, callback=self.parse, meta={
            "playwright": True,
            "playwright_page_methods": [
                PageMethod("route", "**/api/GetClosestStores", self.capture_request),
                PageMethod("wait_for_selector", "//button[contains(text(), 'Start your order')]"),
            ]
        })


    async def capture_request(self, route: Route):
        response = await route.fetch()
        json_data = await response.json()
        await route.fulfill(response=response, json=json_data)


    def parse(self, response: Response):
        pass

(Due to reCAPTCHA protection, using Playwright is essential here.)

@elacuesta
Member

I'd suggest not setting a custom route; it's usually not a good idea given all the processing done in the handler. I think you can do something similar to this, i.e. intercept the "response" event, extract what you need, store it somewhere temporarily, then retrieve it in the main response callback and return it from there. You might need something like an asyncio.Event if you run into synchronization issues, e.g. if the callback runs before the response event handler (though I'm just thinking out loud, not sure that's even an actual possibility).

@elacuesta closed this as not planned on Feb 19, 2025