[Feat] Please add headers to the ExtractParams for the extract() method #1314

Open
nicholas-johnson-techxcel opened this issue Mar 10, 2025 · 0 comments

Problem Description
I do not understand why scrape_url() supports headers for authenticating against websites but extract() does not. I need either to call extract() with auth headers, or to fetch the page with scrape_url() and headers and then pass the result to the extract feature to format the data.

Even with scrape_url(), passing the auth headers from a logged-in browser trips a captcha, and we seemingly have no recourse. Is there some way to guard against fingerprinting? If the captcha is inevitable, could it be returned via the API for us to solve?

Proposed Feature
Add headers to ExtractParams please.

Alternatives Considered
There are not many things I can do about it.

Implementation Suggestions
Would it be any more complicated than just adding the headers param and passing it through the scraper engine and attaching the headers to the scraper requests?
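
A minimal sketch of the pass-through described above. Everything here is an assumption for illustration: this `ExtractParams` dataclass is a hypothetical stand-in rather than the SDK's real type, `build_scrape_requests` is an invented helper, and the request shape used by the actual scraper engine is not known from this issue.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ExtractParams:
    """Hypothetical stand-in for the SDK's extract parameters."""
    prompt: Optional[str] = None
    # Proposed new field: headers to forward with every page request.
    headers: Dict[str, str] = field(default_factory=dict)


def build_scrape_requests(urls: List[str], params: ExtractParams) -> List[dict]:
    """Build one scrape request per URL, attaching the caller's headers.

    Illustrative only: the real engine's request shape may differ.
    Each request gets its own copy of the headers so that later
    per-request changes cannot leak between URLs.
    """
    return [{"url": url, "headers": dict(params.headers)} for url in urls]


params = ExtractParams(
    prompt="List the names in the table.",
    headers={"Cookie": "session=<your-session-cookie>"},  # placeholder value
)
requests_to_send = build_scrape_requests(["https://example.com/items"], params)
```

The point of the sketch is only that the headers travel unchanged from the extract parameters to each underlying page request; whether the real engine can do this as simply is the open question.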

Use Case
A site with authentication that needs to be scraped. This is just a subset of the larger issue: Firecrawl does not support logging in, and the agent is generally disobedient. You cannot ask the LLM in the prompt to enter your username and password and log in. It will not even obey instructions like 'From this URL, click on the first item in the table on the page under the column "Name"'. It would be helpful to have a verbose debugging mode that gives some visibility into what is happening behind the scenes.

Also, scrape_url() does not accept a prompt, so I cannot use scrape_url() for this.

Thanks for hearing me out.

Additional Context
N/A
