ChatGPT is an AI chatbot launched by OpenAI in November 2022. It can process and generate text in multiple languages. We can leverage these multilingual capabilities in many tasks, such as:
Question answering
Style transfer
Vocabulary suggestion
ChatGPT is available in more than 50 languages. However, its responses in languages other than English are generally of lower quality than its English responses.
Language | Country | Language | Country |
Albanian | Albania | Korean | South Korea |
Arabic | Multiple Countries | Latvian | Latvia |
Armenian | Armenia | Lithuanian | Lithuania |
Azerbaijani | Azerbaijan | Macedonian | North Macedonia |
Basque | Spain | Malay | Malaysia |
Belarusian | Belarus | Malayalam | India |
Bengali | Bangladesh, India | Marathi | India |
Bulgarian | Bulgaria | Mongolian | Mongolia |
Catalan | Spain, Andorra | Norwegian | Norway |
Chinese | Multiple Countries | Persian | Iran |
Croatian | Croatia | Polish | Poland |
Czech | Czech Republic | Portuguese | Multiple Countries |
Danish | Denmark | Punjabi | India |
Dutch | Netherlands | Romanian | Romania |
English | Multiple Countries | Russian | Russia |
Estonian | Estonia | Serbian | Serbia |
Filipino | Philippines | Slovak | Slovakia |
Finnish | Finland | Slovenian | Slovenia |
French | Multiple Countries | Spanish | Multiple Countries |
Galician | Spain | Swahili | Kenya, Tanzania |
Georgian | Georgia | Swedish | Sweden |
German | Multiple Countries | Tamil | India, Sri Lanka |
Greek | Greece, Cyprus | Telugu | India |
Gujarati | India | Thai | Thailand |
Hebrew | Israel | Turkish | Turkey |
Hindi | India | Ukrainian | Ukraine |
Hungarian | Hungary | Urdu | Pakistan, India |
Icelandic | Iceland | Uzbek | Uzbekistan |
Indonesian | Indonesia | Vietnamese | Vietnam |
Italian | Italy | Welsh | United Kingdom |
Japanese | Japan | Xhosa | South Africa |
Kannada | India | Yiddish | Multiple Countries |
Kazakh | Kazakhstan | Zulu | South Africa |
There are primarily two ways in which ChatGPT can work on multiple languages:
Replying in another language: ChatGPT replies in the language in which it is questioned.
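Prompt engineering: If we append an instruction such as "Generate in Urdu" to our prompt, ChatGPT replies in the requested language instead of the language of the prompt.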
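The prompt-engineering approach can be sketched as a small helper that appends a language instruction to a prompt. The function name `with_language_instruction` and the exact wording of the instruction are assumptions made for illustration; any clear instruction that names the target language should serve the same purpose.

```python
def with_language_instruction(prompt: str, language: str) -> str:
    """Return the prompt followed by an instruction naming the reply language.

    Hypothetical helper for illustration; the instruction wording is an assumption.
    """
    return f"{prompt.rstrip()} Reply in {language}."


# Build a user message in the role/content format used for chat messages.
message = {
    "role": "user",
    "content": with_language_instruction("When should I irrigate my rice crop?", "Urdu"),
}
print(message["content"])
```

Keeping the instruction in one place makes it easy to switch the target language without rewriting each prompt.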
Let's investigate the multilingual characteristics of ChatGPT using the OpenAI API in Python. Replace the <add-API-key-here> placeholder in the code with your OpenAI API key before running the code below.
Note: You can get an OpenAI API key by following the instructions in the Educative Answer, How to get API key for GPT-3. Please remember that if your OpenAI API trial has expired, requests will fail with an error unless you purchase credits.
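Before the main listing, here is one optional alternative to pasting the key into the source file. As the comment in the listing notes, the client falls back to the `OPENAI_API_KEY` environment variable. The helper below, `load_api_key`, is a hypothetical name and only a minimal sketch: it reads that variable and fails with a clear message when it is missing.

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    Hypothetical helper: raises a clear error when the variable is unset or empty.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key
```

The returned value could then be passed as the `api_key` argument when constructing the client.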
# -*- coding: utf-8 -*-
from pprint import pprint
import glob
from openai import OpenAI
import tiktoken

def num_tokens_from_string(string, encoding_name):
    encoding = tiktoken.get_encoding(encoding_name)
    num_tokens = len(encoding.encode(string))
    return num_tokens

client = OpenAI(
    # api_key defaults to os.environ.get("OPENAI_API_KEY")
    api_key="<add-API-key-here>",
)

max_tok = 200

chat = [{"role": "system", "content": "You are an agricultural advisor."}]
queries = [
    "جو میں نے کھاد اب ڈال دی ہے وہ اگر میں پانچ دن بعد پانی لگا کر برفین بیجتا ہوں تو کھاد کیا ضائع ہو جائے گی یا سہی کام کرے گی",
    "What is the right time to irrigate the rice crop?",
    "مکئی کی فصل ہے اسکو سپرے کرنا ہے۔ جو بیماری آجکل ایئ ہوئی ہے سنڈی والی ابھی چھوٹی ہے۔ بتاۓ اس کا کیا حل ہے۔ بیس دن ابھی ہویے ہے اسکو کاشت کئے ہوئے",
    "When should I use fertilizer on my corn crop?",
]
print("===============QUESTIONING-IN-A-LANGUAGE===============")
for idx, query in enumerate(queries):
    chat.append({"role": "user", "content": query})

    reply = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=chat,  # the full history is sent on every turn
        max_tokens=max_tok
    )
    print("Stage", idx + 1)
    chat.append({"role": "assistant", "content": reply.choices[0].message.content})
    pprint(chat[2*idx+1])  # latest user query
    pprint(chat[2*idx+2])  # latest assistant reply

print("===============PROMPT-ENGINEERING===============")
q_en = "How should I mitigate a pest attack on potato fields?"
suffix = " Reply in Urdu language."
q = q_en + suffix
chat.append({"role": "user", "content": q})

reply = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=chat,
    max_tokens=max_tok
)
chat.append({"role": "assistant", "content": reply.choices[0].message.content})
pprint(chat[-2])
pprint(chat[-1])
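Line 1: We declare the source file encoding as UTF-8 so that the non-ASCII characters in the Urdu strings are read correctly. UTF-8 is already the default in Python 3, so this line mainly makes the choice explicit.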
Lines 2–5: We import the relevant libraries.
Lines 7–10: We define a function that returns the number of tokens in a query or response. This helper is not called in the rest of the example.
Lines 12–15: We create the client object and set the OpenAI API key.
Line 17: We set the maximum number of tokens that will be generated.
Line 19: We use the system identifier to personalize the assistant’s responses by assigning it a role.
Lines 20–25: We define the queries for a chat session. We alternate between Urdu and English queries to demonstrate that the gpt-3.5-turbo model answers in the language of the prompt.
Line 27: We iterate through the queries, performing the following at each step:
Line 28: We append the query with the identifier user to the chat conversation list.
Lines 30–34: We use the create() method of client.chat.completions to generate a response.
Line 36: We append the response with the identifier assistant to the chat conversation list.
Lines 37–38: We print the user query and the assistant's response.
Lines 40–53: We repeat the experiment, but this time get the reply in Urdu by appending the " Reply in Urdu language." instruction to an English prompt.
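The bookkeeping in the loop can be checked without calling the API. The sketch below swaps the network call for a stand-in function, `fake_reply` (hypothetical, not part of the OpenAI library), and uses placeholder queries. It shows why `chat[2*idx+1]` and `chat[2*idx+2]` pick out the latest user query and assistant reply: index 0 holds the system message, and each turn adds exactly two entries.

```python
def fake_reply(messages):
    """Stand-in for the API call (hypothetical): echoes the latest user message."""
    return f"(reply to: {messages[-1]['content']})"


chat = [{"role": "system", "content": "You are an agricultural advisor."}]
queries = ["first question", "second question"]

for idx, query in enumerate(queries):
    chat.append({"role": "user", "content": query})
    # The whole history is passed on every turn, so earlier turns stay in context.
    answer = fake_reply(chat)
    chat.append({"role": "assistant", "content": answer})

    # Index 0 is the system message; turn idx occupies 2*idx+1 and 2*idx+2.
    assert chat[2 * idx + 1]["role"] == "user"
    assert chat[2 * idx + 2]["role"] == "assistant"

print(len(chat))  # 1 system message + 2 entries per turn = 5
```

Because the full list is sent on every request, the prompt grows with each turn, which is worth keeping in mind for long conversations.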