What is OpenAI's chat completions API?

Key takeaways

  • The OpenAI API allows developers to integrate advanced natural language processing capabilities into applications, enabling human-like text generation based on user input.

  • The chat completions API supports a variety of tasks, including text classification, generation, and transformation, and responds to user messages with context-aware output.

  • Key request parameters include model, messages, max_tokens, temperature, and more.

  • The API returns a JSON object that includes a unique ID and an array of choices, each containing valuable information about the generated response.

OpenAI's chat completions API (popularly known as the ChatGPT API) is a service that allows developers to integrate natural language processing capabilities into applications, enabling interactions where the AI generates human-like text based on the input it receives. The API is designed to handle a wide range of conversational tasks, such as answering questions and providing recommendations, all based on the context supplied to it.

The chat completions API

The chat completions API performs various functions on text, such as classification, generation, transformation, completing partial text, providing factual responses, and more. It requires input messages from the user, each with an assigned role, and returns the model's output.

Let's examine the chat completions API in more detail, reviewing the request and response parameters.

What are the request parameters for the chat completions API?

Let's go over some essential request parameters for this API below:

Request parameters for the chat completions API

  • model (string, required): The ID of the model that the chat completions endpoint will use.

  • messages (array, required): The context for generating responses. It is an array of message objects, each with the following fields:

      • role (string): Specifies the sender of the message. Possible values are "system" (initial instructions or directives for the model), "user" (messages from the user interacting with the model), and "assistant" (responses generated by the model).
      • content (string): The text of the message.

  • max_tokens (integer, optional): The maximum number of tokens to generate in the chat completion.

  • temperature (float, optional): The sampling temperature to use, ranging from 0 to 2, with a default of 1. Higher values, such as 0.8, increase randomness in the output, whereas lower values, like 0.2, make it more focused and deterministic.

  • top_p (float, optional): An alternative to temperature sampling called nucleus sampling, in which the model considers only the tokens comprising the top_p probability mass. For example, 0.1 means only the tokens in the top 10% probability mass are considered. The default value is 1.

  • response_format (object, optional): An object that specifies the format of the output. This parameter is supported by the newer models. Setting it to { "type": "json_object" } guarantees that the output is a valid JSON object.
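To see these parameters together, here is a minimal sketch, assuming the openai Python package (v1 or later) is installed and the OPENAI_API_KEY environment variable holds a valid key:

from openai import OpenAI

# The client reads the key from the OPENAI_API_KEY environment variable.
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",  # required: ID of the model to use
    messages=[            # required: array of role/content message objects
        {"role": "system", "content": "You are a helpful assistant designed to output JSON."},
        {"role": "user", "content": "List three popular large language models as a JSON object."}
    ],
    max_tokens=200,       # optional: cap on the number of generated tokens
    temperature=0.2,      # optional: low value for focused, deterministic output
    response_format={"type": "json_object"}  # optional: force valid JSON output
)

print(response.choices[0].message.content)

Note that JSON mode requires the word "JSON" to appear somewhere in the messages, which is why the system message above mentions it.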

Response fields

The response is a JSON object. Some essential attributes are given below:

Response fields for a chat completions API call

  • id (string): A unique ID for the chat completion.

  • choices (array): An array of objects, each containing valuable information about the generated response. The size of the array equals the n request parameter, which determines the number of chat completion choices to generate for each input message (the default is 1).

Among the response fields, the choices array contains the API-generated output. Let's look at it more closely to understand its structure and contents.

  • choices[i].finish_reason (string): The reason the model stopped generating tokens. Possible values are:

      • stop: A natural stop point was reached.
      • length: The maximum number of tokens specified in the request was reached.
      • content_filter: Content was omitted due to a flag raised by a content filter.
      • tool_calls: The model called a tool.

  • choices[i].index (integer): The index of the choice in the list of choices.

  • choices[i].message (object): The generated message, consisting of a role ("assistant") and the content text.
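To see these fields in practice, here is a short sketch (under the same assumptions as above) that requests three alternative completions with the n parameter and inspects each choice:

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Suggest a name for a coding assistant."}],
    n=3  # generate three chat completion choices for this input
)

print(response.id)  # unique ID for this chat completion
for choice in response.choices:  # len(response.choices) == 3
    # index identifies the choice; finish_reason explains why generation stopped
    print(choice.index, choice.finish_reason, choice.message.content)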

This API's input is an array of messages, each comprising a role and text content. Similarly, each choice in the output contains a message consisting of a role ("assistant") and the text of the response.

Completions: input and output model

The chat completions API operates with an array of messages as both input and output. By providing the relevant messages in the array, we can direct it to take specific actions. A carefully crafted prompt will yield favorable output.
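This input/output symmetry is what makes multi-turn conversations possible: the assistant's reply can be appended to the message array and sent back on the next turn. Here is a minimal sketch of that loop, under the same assumptions as above:

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# The conversation so far: each turn is a role/content message in the array.
history = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "What is a token?"},
    {"role": "assistant", "content": "A token is a small chunk of text that the model reads or writes."},
    {"role": "user", "content": "Roughly how many tokens are in a typical sentence?"}
]

response = client.chat.completions.create(model="gpt-4o-mini", messages=history)

# The reply is itself a message (role + content); appending it to the array
# carries the context forward into the next request.
reply = response.choices[0].message
history.append({"role": reply.role, "content": reply.content})
print(reply.content)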

Code example

Let's utilize the chat completions API to generate content about "Large language models." In the code widget below, we'll employ the gpt-4o-mini model for this task, with a temperature value of 0.8.

Note: Before running the code, generate an API key (https://platform.openai.com/api-keys) and replace {{SECRET_KEY}} with the generated API key in the code below.

from openai import OpenAI

# Create the client with your API key.
client = OpenAI(api_key="{{SECRET_KEY}}")

# Request a chat completion from the gpt-4o-mini model.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a tagline about large language models."}
    ],
    temperature=0.8  # a slightly higher value for more creative output
)

# Print the generated text from the first choice.
print(response.choices[0].message.content)

The code initializes an OpenAI client using a provided API key, then sends a request to generate a chat completion. It specifies the gpt-4o-mini model, includes both system and user messages, and sets the temperature parameter to 0.8. Finally, it prints the generated text from the response.

