Using embeddings

Boris Power, Ted Sanders, Logan Kilpatrick
Mar 10, 2022

This notebook contains some helpful snippets you can use to embed text with the 'text-embedding-ada-002' model via the OpenAI API.

import openai  # these snippets use the pre-1.0 `openai` package interface (openai.Embedding)

embedding = openai.Embedding.create(
    input="Your text goes here", model="text-embedding-ada-002"
)["data"][0]["embedding"]
len(embedding)
1536

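Sending many requests in quick succession can trigger API rate limits. To handle this, we recommend wrapping your calls with the 'tenacity' package or another exponential backoff implementation. The "Best practice" function below retries failed requests automatically with increasing delays, so you get your embeddings about as quickly as the rate limits allow instead of failing on the first rate-limit error.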

# Negative example (slow and rate-limited)
import openai

num_embeddings = 10000 # Some large number
for i in range(num_embeddings):
    embedding = openai.Embedding.create(
        input="Your text goes here", model="text-embedding-ada-002"
    )["data"][0]["embedding"]
    print(len(embedding))

# Best practice
import openai
from tenacity import retry, wait_random_exponential, stop_after_attempt

# Make up to 6 attempts, waiting a randomized, exponentially growing delay (capped at 20 seconds) between them
@retry(wait=wait_random_exponential(min=1, max=20), stop=stop_after_attempt(6))
def get_embedding(text: str, model="text-embedding-ada-002") -> list[float]:
    return openai.Embedding.create(input=[text], model=model)["data"][0]["embedding"]

embedding = get_embedding("Your text goes here", model="text-embedding-ada-002")
print(len(embedding))
1536
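
The text above mentions "another exponential backoff implementation" as an alternative to 'tenacity'. As a rough illustration of what that could look like using only the standard library, here is a minimal sketch. The decorator name `retry_with_backoff` and its parameters are hypothetical (they are not part of the OpenAI library or of tenacity), and it retries on any exception for simplicity; in practice you would likely narrow this to the rate-limit error your client library raises.

```python
import random
import time
from functools import wraps


def retry_with_backoff(max_attempts=6, min_delay=1.0, max_delay=20.0, sleep=time.sleep):
    """Hypothetical helper: retry a function with randomized exponential backoff."""

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    # After the final attempt, give up and surface the error.
                    if attempt == max_attempts:
                        raise
                    # The upper bound doubles on each attempt, capped at max_delay.
                    upper = min(max_delay, min_delay * 2 ** (attempt - 1))
                    # Random jitter spreads out retries from concurrent clients.
                    sleep(random.uniform(0, upper))

        return wrapper

    return decorator
```

Under these assumptions, you could decorate a function such as `get_embedding` with `@retry_with_backoff()` in place of the tenacity decorator. The `sleep` parameter exists only so the delay can be swapped out (for example, in tests); the default is `time.sleep`.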