OpenAI's GPT-3 model is a tool for generating human-like text. It is versatile and is used in a wide range of applications, such as chatbots and content creation. GPT-3 can retain the message history of a conversation, which allows it to produce contextually relevant responses. However, when you use the LlamaIndex library to train GPT-3 on your own data, how to leverage this feature is not immediately obvious.
Before exploring the solution, let's first understand the challenge. With the standard ChatGPT API, maintaining a message history as the context for the conversation is relatively straightforward. Here's a basic example:
message_history = []
completion = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=message_history)
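In a full exchange, the history grows by appending each user message and each model reply. Here is a minimal sketch of that pattern, assuming the pre-1.0 openai Python SDK (which exposes openai.ChatCompletion) and a placeholder API key:

import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

message_history = []

# Record the user's message, ask the model, then record its reply
message_history.append({"role": "user", "content": "Hello!"})
completion = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=message_history)
reply = completion["choices"][0]["message"]["content"]
message_history.append({"role": "assistant", "content": reply})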
However, when employing the LlamaIndex library to train GPT-3 with a more focused context, it's not readily apparent how to get the model to take the message_history into account as well.
The solution involves adjusting the way you interact with the llama-index library. Here's a step-by-step guide.
The first step is to configure the LlamaIndex library with the correct parameters. This includes defining the maximum input size, the number of output tokens, the maximum chunk overlap, and the chunk size limit. You also have to define the LLM predictor and the context (dataset).
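The function below assumes the legacy llama-index (originally gpt-index) API, in which these classes are exposed at the package's top level, together with LangChain's OpenAI wrapper used by the LLM predictor; the exact import paths may differ depending on your installed versions:

# Assumed imports for the legacy llama-index API used below; adjust to your versions
from llama_index import GPTSimpleVectorIndex, LLMPredictor, PromptHelper, ServiceContext, SimpleDirectoryReader
from langchain.llms import OpenAI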
def construct_index(directory_path):
    max_input_size = 4096
    num_outputs = 2000
    max_chunk_overlap = 20
    chunk_size_limit = 600
    prompt_helper = PromptHelper(max_input_size, num_outputs, max_chunk_overlap, chunk_size_limit=chunk_size_limit)
    llm_predictor = LLMPredictor(llm=OpenAI(temperature=0.5, model_name="text-ada-001", max_tokens=num_outputs))
    documents = SimpleDirectoryReader(directory_path).load_data()
    service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, prompt_helper=prompt_helper)
    index = GPTSimpleVectorIndex.from_documents(documents, service_context=service_context)
    index.save_to_disk("index.json")
    return index
Line 1: This line defines a function named construct_index that takes a directory path as an argument.
Line 2: This line sets the maximum input size for the model to 4096 tokens.
Line 3: This line sets the number of output tokens that the model should generate to 2000.
Line 4: This line sets the maximum overlap between chunks to 20 tokens.
Line 5: This line sets the maximum size of each chunk to 600 tokens.
Line 6: This line creates an instance of the PromptHelper class with the parameters defined above.
Line 7: This line creates an instance of the LLMPredictor class with the specified parameters.
Line 8: This line loads documents from the specified directory into memory.
Line 9: This line creates an instance of the ServiceContext class with the LLMPredictor and PromptHelper instances.
Line 10: This line creates an instance of the GPTSimpleVectorIndex class from the loaded documents and the ServiceContext instance.
Line 11: This line saves the created index to a file named "index.json".
Line 12: This line returns the created index.
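As a quick usage sketch, the function is called once to build and persist the index; the directory name here is only a placeholder:

# Build and persist the index from a folder of source documents (hypothetical path)
index = construct_index("./docs")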
Next, you have to load the index you previously saved to disk.
index = GPTSimpleVectorIndex.load_from_disk("index.json")
At this point, you can query the loaded index with the user's input. The index's response serves as the model's reply.
response = index.query(user_input, response_mode='compact')
After receiving the response, you can update the message_history with both the user's input and the model's response.
message_history.append({"role": "user", "content": user_input})message_history.append({"role": "system", "content": response.response.strip()})
You can now repeat this procedure for each new user input, making sure that the message_history is updated each time; one possible end-to-end loop is sketched below.
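Here is one possible way to tie these steps together. Prepending the accumulated message_history to the query text is only one simple, assumed way of passing conversational context to the index, and the helper name chat_loop is hypothetical:

def chat_loop(index):
    # Hypothetical helper that combines index queries with a running message_history
    message_history = []
    while True:
        user_input = input("You: ")
        if not user_input:
            break
        # One simple (assumed) way to supply context: prepend the history to the query text
        context = "\n".join(f"{m['role']}: {m['content']}" for m in message_history)
        query_text = f"{context}\nuser: {user_input}" if context else user_input
        response = index.query(query_text, response_mode="compact")
        answer = response.response.strip()
        print("Bot:", answer)
        # Keep the history up to date for the next turn
        message_history.append({"role": "user", "content": user_input})
        message_history.append({"role": "assistant", "content": answer})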
In this Answer, we have detailed the process of integrating message_history into a LlamaIndex-based GPT-3 model using Python. By following these steps, you can ensure that your model generates responses that are contextually relevant to the conversation.