How to input PDF with a prompt to GPT API

In the age of digital transformation, the ability to extract and process information from various document formats, including PDFs, is invaluable. If you have a PDF document and want to perform queries such as “What does this PDF contain?”, “Summarize the PDF”, or ask specific questions about the PDF data, then this Answer is for you.

The user supplies two inputs: the PDF document and the specific query they want answered from its contents.

We will leverage the capabilities of OpenAI and AskYourPDF to answer the user’s query.

OpenAI's chat completion API will first turn the user’s question into a query for AskYourPDF.
AskYourPDF will then process the PDF document and retrieve the data relevant to that query.

Once the relevant data is retrieved, OpenAI's chat completion API will be used again to formulate a response. The extracted data, combined with the user’s original query, will be included in the prompt as context to ensure an accurate answer based on the data in the PDF.
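
The three steps above can be sketched as one function whose dependencies are passed in. This makes the data flow visible and lets you dry-run it without API keys. The names `answer_from_pdf`, `ask_llm`, and `ask_pdf` are hypothetical and not part of either service's API; in a real run they would wrap OpenAI's chat completion call and AskYourPDF's chat endpoint.

```python
from typing import Callable


def answer_from_pdf(
    user_query: str,
    ask_llm: Callable[[str], str],
    ask_pdf: Callable[[str], str],
) -> str:
    """Run the query -> retrieve -> answer flow with injected dependencies."""
    # Step 1: let the LLM turn the user's question into a retrieval query.
    retrieval_query = ask_llm(
        f"Based on the user's query '{user_query}', "
        "what should be extracted from the document?"
    )
    # Step 2: retrieve the relevant content from the PDF service.
    extracted = ask_pdf(retrieval_query)
    # Step 3: answer the original question with the extracted content as context.
    return ask_llm(
        f"The user asked: '{user_query}'. "
        f"The extracted content from the document is: '{extracted}'. "
        "How should I respond?"
    )
```

Because both services are plain callables here, you can substitute stand-in functions during testing and swap in the real API wrappers later.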

Introduction to AskYourPDF and OpenAI GPT

AskYourPDF is a service that lets developers programmatically extract information from PDF files and build custom chatbots on top of them. OpenAI GPT is a language model that generates human-like text in response to prompts.

Retrieving data from AskYourPDF

import json
import os

import openai
import requests

# Initialize OpenAI API (the key is read from an environment variable)
openai.api_key = os.environ['SECRET_KEY_OPENAI']
# AskYourPDF API key, sent in the x-api-key header of every request
ASKYOURPDF_API_KEY = os.environ['SECRET_KEY_AskYourPDF']


def upload_pdf_to_askyourpdf(file_path):
    """Upload a PDF to AskYourPDF and return the ID assigned to the document."""
    headers = {
        'x-api-key': ASKYOURPDF_API_KEY
    }
    # Open in binary mode so the file is sent unchanged as multipart form data
    with open(file_path, 'rb') as file_data:
        response = requests.post('https://api.askyourpdf.com/v1/api/upload', headers=headers, files={'file': file_data})
    if response.status_code == 201:
        return response.json()['doc_id']
    else:
        raise Exception(f"Error uploading PDF: {response.status_code}")


def chat_with_document(doc_id, message):
    """Send a message about an uploaded document and return the reply text."""
    headers = {
        'Content-Type': 'application/json',
        'x-api-key': ASKYOURPDF_API_KEY
    }
    # The chat endpoint expects a JSON list of messages
    data = [
        {
            "sender": "User",
            "message": message
        }
    ]
    response = requests.post(f'https://api.askyourpdf.com/v1/chat/{doc_id}', headers=headers, data=json.dumps(data))
    if response.status_code == 200:
        return response.json()['answer']['message']
    else:
        raise Exception(f"Error chatting with document: {response.status_code}")


def main():
    # Get user input
    user_prompt = input("Enter your query about the PDF: ")
    pdf_file_path = input("Provide the path to your PDF file: ")

    # Use OpenAI GPT to generate a query for AskYourPDF
    extraction_prompt = f"I have a plugin that can extract information from a PDF. Based on the user's query '{user_prompt}', what should I ask the plugin to extract from the document?"
    extraction_query = openai.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": extraction_prompt}],
    ).choices[0].message.content

    # Upload the PDF to AskYourPDF
    doc_id = upload_pdf_to_askyourpdf(pdf_file_path)
    # Extract the content based on GPT's query
    extracted_content = chat_with_document(doc_id, extraction_query)

    # Use GPT to generate an answer based on extracted content and user's initial prompt
    answer_prompt = f"The user asked: '{user_prompt}'. The extracted content from the document is: '{extracted_content}'. How should I respond?"
    # Pass answer_prompt so the reply is grounded in the extracted content
    answer = openai.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": answer_prompt}],
    ).choices[0].message.content

    # Print the answer
    print(f"Answer: {answer}")


if __name__ == "__main__":
    main()

If you’re familiar with retrieval-augmented generation (RAG), you might notice that the process described in this answer follows a similar logic.

At its core, RAG involves retrieving relevant information from an external source and using that information to enhance an LLM’s output. In our case, AskYourPDF handles the retrieval by surfacing the most relevant text chunks from the PDF, and GPT-4 uses those chunks as context to generate a meaningful, informed response.
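
To make the retrieval half of that idea concrete, here is a minimal sketch that ranks text chunks by how many words they share with the query. It is an illustration only: AskYourPDF's actual retrieval method is not described here and is almost certainly more sophisticated, and the function names `tokenize` and `retrieve` are hypothetical.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(chunks: list[str], query: str, top_k: int = 2) -> list[str]:
    """Return up to top_k chunks ranked by word overlap with the query."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(chunk)), chunk) for chunk in chunks]
    # Keep only chunks sharing at least one word; highest overlap first.
    # sorted() is stable, so ties keep their original document order.
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [chunk for _, chunk in ranked[:top_k]]


chunks = [
    "The warranty lasts two years.",
    "Shipping takes five days.",
    "Returns are accepted within 30 days.",
]
print(retrieve(chunks, "How long is the warranty?"))
# prints ['The warranty lasts two years.']
```

The retrieved chunks would then be placed in the prompt as context, which is the role `extracted_content` plays in the program above.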

So while you’re not technically building a full RAG pipeline here, you’re applying the same principles: retrieve the relevant text first, then generate an answer from it.

Conclusion

By integrating AskYourPDF with OpenAI GPT, you can create powerful applications that extract and process information from PDFs based on user prompts. This approach not only enhances user experience but also opens up new avenues for document analysis and interaction.

Frequently asked questions

Do I need to use both AskYourPDF and OpenAI GPT?

For the workflow shown here, yes. AskYourPDF is responsible for extracting content from the PDF, while OpenAI GPT interprets that content and generates meaningful responses. They work together: one retrieves, the other reasons.

Can I replace AskYourPDF with my own PDF parser?

You can, but you’ll need to handle text extraction and chunking yourself. AskYourPDF simplifies this process and includes its own chat endpoint for querying extracted data.
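
As a rough idea of what the chunking part involves, the sketch below splits already-extracted text into overlapping windows of words. It assumes you have obtained the raw text with a PDF library of your choice; the function name `chunk_text` and the default sizes are hypothetical, not values recommended by AskYourPDF or OpenAI.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into windows of chunk_size words that share `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


print(chunk_text("a b c d e f g h i j", chunk_size=4, overlap=1))
# prints ['a b c d', 'd e f g', 'g h i j']
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some words twice.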

How is this different from uploading a PDF directly to ChatGPT?

ChatGPT (with tools enabled) can process PDFs in a limited way, but it doesn’t offer fine control over retrieval or workflow automation. Using AskYourPDF and GPT via API gives you full programmatic access and flexibility.

What kinds of tasks work best with this setup?

This setup works well for summarization, answering factual questions from structured or semi-structured documents, extracting key insights, or supporting customer queries based on documentation.

Is this approach scalable for many PDFs or ongoing document streams?

For small-scale or interactive workflows, yes. For larger scale, consider batching uploads, using vector databases, or switching to a more traditional retrieval-augmented generation (RAG) setup.
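
One way to batch uploads is to run them concurrently and collect failures instead of stopping at the first one. The sketch below takes the upload function as a parameter (for example, `upload_pdf_to_askyourpdf` from the program above); the name `upload_many` and the default of four workers are assumptions, and the sketch does not account for any rate limits the service may enforce.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def upload_many(
    paths: list[str],
    upload: Callable[[str], str],
    max_workers: int = 4,
) -> tuple[dict[str, str], dict[str, Exception]]:
    """Upload files concurrently; return (doc IDs by path, errors by path)."""
    doc_ids: dict[str, str] = {}
    errors: dict[str, Exception] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every upload up front, then collect results in input order.
        futures = {path: pool.submit(upload, path) for path in paths}
        for path, future in futures.items():
            try:
                doc_ids[path] = future.result()
            except Exception as exc:
                # Record the failure and keep processing the other files.
                errors[path] = exc
    return doc_ids, errors
```

Returning the errors alongside the successful IDs lets the caller retry only the files that failed.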

Does the model really answer based on the PDF or just guess?

If the AskYourPDF step is correctly implemented, GPT answers from context retrieved directly from the PDF rather than guessing. That’s the advantage of this retrieval-first approach.
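
You can reinforce this behavior in the prompt itself by telling the model to rely only on the retrieved text. The sketch below builds such a message list; the function name `build_grounded_messages` and the instruction wording are hypothetical, and an instruction like this reduces guessing but does not guarantee it never happens.

```python
def build_grounded_messages(question: str, extracted_content: str) -> list[dict[str, str]]:
    """Build chat messages that restrict the model to the extracted content."""
    system = (
        "Answer using only the document excerpt provided by the user. "
        "If the excerpt does not contain the answer, say that the document "
        "does not cover it instead of guessing."
    )
    user = f"Document excerpt:\n{extracted_content}\n\nQuestion: {question}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]
```

The returned list can be passed as the `messages` argument of `openai.chat.completions.create`, in place of the single-message list used in the program above.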
