7.7.5. OPENAI-05 — Embeddings and Models

This tutorial covers text embeddings (and comparing them), listing and retrieving models, and why model ids are percent-encoded in the URL path.

7.7.5.1. Embeddings and cosine similarity

embed turns a string into a vector of floats. Semantically similar texts produce vectors that point in similar directions — measured by cosine similarity (1.0 = identical direction, 0.0 = orthogonal):

require openai/openai_embeddings
require math

let v1 = embed(client, "text-embedding-3-small", "a cat sat on the mat")
let v2 = embed(client, "text-embedding-3-small", "the feline rested on the rug")

def cosine(a, b : array<float>) : float {
    var dot = 0.0; var na = 0.0; var nb = 0.0
    for (i in range(length(a))) {
        dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i]
    }
    return dot / (sqrt(na) * sqrt(nb))
}
print("similarity: {cosine(v1, v2)}\n")

For the full EmbeddingResponse (per-input vectors + usage), use embeddings(client, req) with an EmbeddingRequest.

7.7.5.2. Listing and retrieving models

require openai/openai_models

let list = list_models(client)
for (m in list.models) {
    print("  {m.id} (owned_by {m.owned_by})\n")
}

let one = retrieve_model(client, "gpt-4o-mini")
if (one.ok) {
    print("retrieved: {one.model.id}\n")
}

7.7.5.3. Percent-encoded model ids

Many providers use model ids with slashes (e.g. google/gemma-2:free). retrieve_model percent-encodes the id so the / is not parsed as extra path segments. percent_encode (in openai_common) is the helper that does it:

require openai/openai_common
print("{percent_encode("google/gemma-2:free")}\n")
// → google%2Fgemma-2%3Afree

7.7.5.4. Quick Reference

Function

Description

embed(client, model, text)

Single string → vector (array<float>)

embeddings(client, req)

Full EmbeddingResult (vectors + usage)

list_models(client)

All available models

retrieve_model(client, id)

One model by id (path-encoded)

percent_encode(s)

URL-encode a path segment