Azure chat completions example (preview)

OpenAI LogoOpenAI LogoOpenAI Logo
Christian Mürtz, Gerardo Lecaros, Krista Pratico
Mar 28, 2023
Open in Github

In this example we'll try to go over all operations needed to get chat completions working using the Azure endpoints.
This example focuses on chat completions but also touches on some other operations that are also available using the API. This example is meant to be a quick way of showing simple operations and is not meant as a tutorial.

import os
import openai

Setup

For the following sections to work properly we first have to setup some things. Let's start with the api_base and api_version. To find your api_base go to https://portal.azure.com, find your resource and then under "Resource Management" -> "Keys and Endpoints" look for the "Endpoint" value.

openai.api_version = '2023-05-15'
openai.api_base = '' # Please add your endpoint here

We next have to setup the api_type and api_key. We can either get the key from the portal or we can get it through Microsoft Active Directory Authentication. Depending on this the api_type is either azure or azure_ad.

Setup: Portal

Let's first look at getting the key from the portal. Go to https://portal.azure.com, find your resource and then under "Resource Management" -> "Keys and Endpoints" look for one of the "Keys" values.

openai.api_type = 'azure'
openai.api_key = os.environ["OPENAI_API_KEY"]

Note: In this example, we configured the library to use the Azure API by setting the variables in code. For development, consider setting the environment variables instead:

OPENAI_API_BASE
OPENAI_API_KEY
OPENAI_API_TYPE
OPENAI_API_VERSION
# from azure.identity import DefaultAzureCredential

# default_credential = DefaultAzureCredential()
# token = default_credential.get_token("https://cognitiveservices.azure.com/.default")

# openai.api_type = 'azure_ad'
# openai.api_key = token.token

A token is valid for a period of time, after which it will expire. To ensure a valid token is sent with every request, you can refresh an expiring token by hooking into requests.auth:

import typing
import time
import requests
if typing.TYPE_CHECKING:
    from azure.core.credentials import TokenCredential

class TokenRefresh(requests.auth.AuthBase):

    def __init__(self, credential: "TokenCredential", scopes: typing.List[str]) -> None:
        self.credential = credential
        self.scopes = scopes
        self.cached_token: typing.Optional[str] = None

    def __call__(self, req):
        if not self.cached_token or self.cached_token.expires_on - time.time() < 300:
            self.cached_token = self.credential.get_token(*self.scopes)
        req.headers["Authorization"] = f"Bearer {self.cached_token.token}"
        return req

session = requests.Session()
session.auth = TokenRefresh(default_credential, ["https://cognitiveservices.azure.com/.default"])

openai.requestssession = session

Deployments

In this section we are going to create a deployment using the gpt-35-turbo model that we can then use to create chat completions.

deployment_id = '' # Fill in the deployment id from the portal here
# For all possible arguments see https://platform.openai.com/docs/api-reference/chat-completions/create
response = openai.ChatCompletion.create(
    deployment_id=deployment_id,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Knock knock."},
        {"role": "assistant", "content": "Who's there?"},
        {"role": "user", "content": "Orange."},
    ],
    temperature=0,
)

print(f"{response.choices[0].message.role}: {response.choices[0].message.content}")

We can also stream the response.

response = openai.ChatCompletion.create(
    deployment_id=deployment_id,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Knock knock."},
        {"role": "assistant", "content": "Who's there?"},
        {"role": "user", "content": "Orange."},
    ],
    temperature=0,
    stream=True
)

for chunk in response:
    delta = chunk.choices[0].delta

    if "role" in delta.keys():
        print(delta.role + ": ", end="", flush=True)
    if "content" in delta.keys():
        print(delta.content, end="", flush=True)