Talk to PDF using langchain

This project uses PyPDF2, LangChain, OpenAI embeddings, and FAISS to extract text from the "Agile Practice Guide" PDF document and answer questions about its content. The sections below cover the requirements, the setup, and the processing workflow.

Requirements

PyPDF2: For reading PDF files.
langchain_openai: For OpenAI embeddings and API interactions.
langchain.text_splitter: For splitting text into manageable chunks.
langchain_community.vectorstores: Provides the FAISS vector store.
langchain.chains.question_answering: For loading the question-answering chain.
os: For accessing environment variables (part of the Python standard library, so nothing to install).

Setup

Set your OpenAI API key in the OPENAI_API_KEY environment variable.
Install the required Python packages listed in the Requirements section. The repository includes a requirements.txt, so pip install -r requirements.txt should cover them.
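
A quick way to confirm that the key is visible to Python before running the script is a small check like the one below. This is a minimal sketch and not part of the project's script: the helper name require_api_key is hypothetical, and the env parameter exists only so the check can be exercised without touching the real environment.

```python
import os


def require_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the API key, or raise a clear error if it is missing or empty."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the script.")
    return key
```

Calling require_api_key() at the top of a script fails early with a readable message instead of failing later inside an API call.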

Workflow

PDF Reading: Use PdfReader from PyPDF2 to open and read the PDF file.
Text Extraction: Loop through each page of the PDF, extract its text, and append it to a single raw-text string.
Text Splitting: Use CharacterTextSplitter to split the raw text into smaller, manageable chunks.
Embedding Initialization: Initialize OpenAIEmbeddings, which is used to turn each chunk into a vector.
Vector Store Creation: Create a FAISS vector store from the text chunks using the embeddings model.
Question-Answering Chain Loading: Load a question-answering chain from langchain.chains.question_answering.
Query Processing: Run a similarity search over the vector store with a predefined query, then invoke the question-answering chain with the retrieved documents and the query to get the answer.
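
The steps above can be sketched as follows. This is an illustrative outline under stated assumptions, not the repository's own script: the function names, the "\n" separator, the chunk_size and chunk_overlap values, the use of langchain_openai.OpenAI as the language model, and the "stuff" chain type are all assumptions. The import paths follow the Requirements section and may differ in other LangChain versions.

```python
from typing import Iterable, Optional


def join_page_texts(page_texts: Iterable[Optional[str]]) -> str:
    """Concatenate per-page text, skipping pages that yielded no text."""
    return "".join(text for text in page_texts if text)


def answer_from_pdf(pdf_path: str, query: str) -> str:
    """Answer `query` from the PDF at `pdf_path` (hypothetical helper)."""
    # Third-party imports live inside the function so this module can be
    # imported even where the packages are not installed.
    from PyPDF2 import PdfReader
    from langchain.text_splitter import CharacterTextSplitter
    from langchain_openai import OpenAI, OpenAIEmbeddings
    from langchain_community.vectorstores import FAISS
    from langchain.chains.question_answering import load_qa_chain

    # PDF reading and text extraction: collect the text of every page.
    reader = PdfReader(pdf_path)
    raw_text = join_page_texts(page.extract_text() for page in reader.pages)

    # Text splitting: sizes are assumed values, tune them for your document.
    splitter = CharacterTextSplitter(
        separator="\n",
        chunk_size=1000,
        chunk_overlap=200,
        length_function=len,
    )
    chunks = splitter.split_text(raw_text)

    # Embeddings and vector store: OpenAIEmbeddings reads OPENAI_API_KEY
    # from the environment.
    embeddings = OpenAIEmbeddings()
    store = FAISS.from_texts(chunks, embeddings)

    # Question-answering chain: "stuff" places all retrieved chunks in one prompt.
    chain = load_qa_chain(OpenAI(), chain_type="stuff")

    # Query processing: retrieve similar chunks, then ask the chain.
    docs = store.similarity_search(query)
    result = chain.invoke({"input_documents": docs, "question": query})
    return result["output_text"]
```

Running answer_from_pdf requires the packages from the Requirements section, a FAISS build such as faiss-cpu, and a valid OPENAI_API_KEY. The join_page_texts helper guards against pages for which text extraction returns nothing, such as scanned images.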

How to Use

Run the provided Python script (main.py) to process 'agile-practice-guide-english.pdf'.
The script handles text extraction, chunking, and retrieval for the query defined in the script.
The answer from the question-answering chain is printed to the console.

Note

Replace 'yourpdf.pdf' with the path to your target PDF file, and set query to the question you want to ask about the document.
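
If editing the script for every document becomes tedious, one option is to take both values from the command line. The sketch below is an assumption about how that could look, not something the project provides: parse_args, the argument names pdf_path and query, and the sample question are all hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Parse the PDF path and the question from a list of arguments."""
    parser = argparse.ArgumentParser(description="Ask a question about a PDF.")
    parser.add_argument("pdf_path", help="path to the target PDF file")
    parser.add_argument("query", help="question to ask about the document")
    return parser.parse_args(argv)


# Example with explicit arguments; passing None instead reads sys.argv.
args = parse_args(["yourpdf.pdf", "What is this document about?"])
print(args.pdf_path)
print(args.query)
```

The parsed values can then be used wherever the script currently hard-codes the PDF path and the query.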

Repository: gapilongo/ask_pdf
