ext::ai

To activate EdgeDB AI functionality, you can use the extension mechanism:

using extension ai;

Use the configure command to set the AI extension's configuration. Update the values with either configure session or configure current branch, depending on the scope you prefer:

db> configure current branch
... set ext::ai::Config::indexer_naptime := <duration>'0:00:30';
OK: CONFIGURE DATABASE

The only property currently available is indexer_naptime, which specifies the minimum delay between deferred ext::ai::index indexer runs on any given branch.
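
For example, a sketch of applying a shorter naptime to the current session only (the duration value here is illustrative):

configure session
set ext::ai::Config::indexer_naptime := <duration>'0:00:10';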

Examine the extensions link of the cfg::Config object to check the current config values:

db> select cfg::Config.extensions[is ext::ai::Config]{*};
{
  ext::ai::Config {
    id: 1a53f942-d7ce-5610-8be2-c013fbe704db,
    indexer_naptime: <duration>'0:00:30'
  }
}

You may also restore the default config value using configure session reset if you set it on the session or configure current branch reset if you set it on the branch:

db> configure current branch reset ext::ai::Config::indexer_naptime;
OK: CONFIGURE DATABASE

Provider configs are required for AI indexes (for embedding generation) and for RAG (for text generation). They may be added via edgedb ui or via EdgeQL:

configure current database
insert ext::ai::OpenAIProviderConfig {
  secret := 'sk-....',
};

The extension provides a configuration type for each supported provider, as well as a custom provider type compatible with one of the supported API styles:

  • ext::ai::OpenAIProviderConfig

  • ext::ai::MistralProviderConfig

  • ext::ai::AnthropicProviderConfig

  • ext::ai::CustomProviderConfig

All provider types require that the secret property be set to a string containing the secret provided by the AI vendor. Other properties may optionally be set:

  • name: A unique provider name

  • display_name: A human-friendly provider name

  • api_url: The provider’s API URL

  • client_id: ID for the client provided by model API vendor

In addition to the required secret property, ext::ai::CustomProviderConfig requires that the api_style property be set. Available values are ext::ai::ProviderAPIStyle.OpenAI and ext::ai::ProviderAPIStyle.Anthropic.
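
For example, a custom provider speaking the OpenAI API style might be configured like this (a sketch; the name, display_name, api_url, and secret values are placeholders for your own provider's details):

configure current database
insert ext::ai::CustomProviderConfig {
  # placeholder values; substitute your provider's details
  name := 'my_provider',
  display_name := 'My Provider',
  api_url := 'https://api.my-provider.example.com/v1',
  secret := 'sk-....',
  api_style := ext::ai::ProviderAPIStyle.OpenAI,
};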

You may add prompts either via edgedb ui or via EdgeQL. Here’s an example of how you might add a prompt with a single message:

insert ext::ai::ChatPrompt {
  name := 'test-prompt',
  messages := (
    insert ext::ai::ChatPromptMessage {
      participant_role := ext::ai::ChatParticipantRole.System,
      content := "Your message content"
    }
  )
};

participant_role may be any of these values:

  • ext::ai::ChatParticipantRole.System

  • ext::ai::ChatParticipantRole.User

  • ext::ai::ChatParticipantRole.Assistant

  • ext::ai::ChatParticipantRole.Tool

ext::ai::ChatPromptMessage also has a participant_name property, which is an optional str.
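
As a sketch, a prompt combining a System message with a named User message might look like this (the prompt name, message contents, and participant_name are illustrative):

with
  # each message is inserted first, then linked from the prompt
  system_message := (insert ext::ai::ChatPromptMessage {
    participant_role := ext::ai::ChatParticipantRole.System,
    content := 'Answer using only the provided context.'
  }),
  user_message := (insert ext::ai::ChatPromptMessage {
    participant_role := ext::ai::ChatParticipantRole.User,
    participant_name := 'Alice',
    content := 'What color is the sky on Mars?'
  })
insert ext::ai::ChatPrompt {
  name := 'qa-prompt',
  messages := {system_message, user_message}
};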

The ext::ai::index creates a deferred semantic similarity index of an expression on a type.

module default {
  type Astronomy {
    content: str;
    deferred index ext::ai::index(embedding_model := 'text-embedding-3-small')
      on (.content);
  }
};

It can accept several named arguments (a combined example follows this list):

  • embedding_model: The name of the model to use for embedding generation as a string.

    You may use any of these pre-configured embedding generation models:

    OpenAI

    • text-embedding-3-small

    • text-embedding-3-large

    • text-embedding-ada-002

    Learn more about the OpenAI embedding models

    Mistral

    • mistral-embed

    Learn more about the Mistral embedding model

  • distance_function: The function to use for determining semantic similarity. Default: ext::ai::DistanceFunction.Cosine

    The distance function may be any of these:

    • ext::ai::DistanceFunction.Cosine

    • ext::ai::DistanceFunction.InnerProduct

    • ext::ai::DistanceFunction.L2

  • index_type: The type of index to create. Currently the only option is the default: ext::ai::IndexType.HNSW.

  • index_parameters: A named tuple of additional index parameters:

    • m: The maximum number of edges of each node in the graph. Increasing it can improve search accuracy at the cost of index size. Default: 32

    • ef_construction: Dictates the depth and width of the search when building the index. Higher values can lead to better connections and more accurate results at the cost of build time and resource usage. Default: 100
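
Putting these together, an index declaration that spells out the named arguments might look like the following sketch (the embedding model matches the earlier example; the other values are simply the documented defaults):

module default {
  type Astronomy {
    content: str;
    deferred index ext::ai::index(
      embedding_model := 'text-embedding-3-small',
      distance_function := ext::ai::DistanceFunction.Cosine,
      index_parameters := (m := 32, ef_construction := 100)
    )
      on (.content);
  }
};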

If you find your queries are not returning the expected results, try inspecting your instance logs. On an EdgeDB Cloud instance, use the “Logs” tab in your instance dashboard. On local or CLI-linked remote instances, use edgedb instance logs -I <instance-name>. You may find the problem there.

Providers impose rate limits on their APIs, which can often be the source of AI index problems. If index creation hits a rate limit, EdgeDB will wait for the indexer_naptime duration (see the docs on ext::ai configuration) and then resume index creation.

If your indexed property contains values that exceed the token limit for a single request, you may consider truncating the property value in your index expression. You can do this with a string by slicing it:

module default {
  type Astronomy {
    content: str;
    deferred index ext::ai::index(embedding_model := 'text-embedding-3-small')
      on (.content[0:10000]);
  }
};

This example indexes only the first 10,000 characters of the content property.

Tokens are not equivalent to characters. For OpenAI embedding generation, you may test values via OpenAI’s web-based tokenizer. You may alternatively download the library OpenAI uses for tokenization from that same page if you prefer. By testing, you can get an idea of how much of your content can be sent for indexing.

  • ext::ai::to_context(): Evaluates the expression of an ai::index on the passed object and returns it.

  • ext::ai::search(): Search an object using its ai::index index.

function

ext::ai::to_context()
ext::ai::to_context(object: anyobject) -> str

Evaluates the expression of an ai::index on the passed object and returns it.

This can be useful for confirming the basis of embedding generation for a particular object or type.

Given this schema:

module default {
  type Astronomy {
    topic: str;
    content: str;
    deferred index ext::ai::index(embedding_model := 'text-embedding-3-small')
      on (.topic ++ ' ' ++ .content);
  }
};

and with these inserts:

db> insert Astronomy {
...   topic := 'Mars',
...   content := 'Skies on Mars are red.'
... }

db> insert Astronomy {
...   topic := 'Earth',
...   content := 'Skies on Earth are blue.'
... }

to_context returns these results:

db> select ext::ai::to_context(Astronomy);
{'Mars Skies on Mars are red.', 'Earth Skies on Earth are blue.'}

db> select ext::ai::to_context((select Astronomy limit 1));
{'Mars Skies on Mars are red.'}

function

ext::ai::search()
ext::ai::search( object: anyobject, query: array<float32> ) -> optional tuple<object: anyobject, distance: float64>

Search an object using its ai::index index.

Returns objects that match the specified semantic query along with their similarity scores.

The query argument should not be a textual query but the embeddings generated from a textual query. To have EdgeDB generate the query for you along with a text response, try our built-in RAG.

db> with query := <array<float32>><json>$query
...   select ext::ai::search(Knowledge, query);
{
  (
    object := default::Knowledge {id: 9af0d0e8-0880-11ef-9b6b-4335855251c4},
    distance := 0.20410746335983276
  ),
  (
    object := default::Knowledge {id: eeacf638-07f6-11ef-b9e9-57078acfce39},
    distance := 0.7843298847773637
  ),
  (
    object := default::Knowledge {id: f70863c6-07f6-11ef-b9e9-3708318e69ee},
    distance := 0.8560434728860855
  ),
}
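
Since each result couples the matched object with its distance, you might order by distance and keep only the closest matches. A sketch, assuming the same Knowledge type and a $query parameter holding the query embeddings:

with query := <array<float32>><json>$query
select ext::ai::search(Knowledge, query)
order by .distance  # smaller distance means a closer match
limit 2;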

Use the AI extension’s HTTP endpoints to perform retrieval-augmented generation using your AI indexes or to generate embeddings against a model of your choice.

All EdgeDB server HTTP endpoints require authentication. By default, you may use HTTP Basic Authentication with your EdgeDB username and password.

POST: https://<edgedb-host>:<port>/branch/<branch-name>/ai/rag

Responds with text generated by the specified text generation model in response to the provided query.

Make a POST request to the endpoint with a JSON body. The body may have these properties:

  • model (string, required): The name of the text generation model to use.

    You may use any of these text generation models:

    OpenAI

    • gpt-3.5-turbo

    • gpt-4-turbo-preview

    Learn more about the OpenAI text generation models

    Mistral

    • mistral-small-latest

    • mistral-medium-latest

    • mistral-large-latest

    Learn more about the Mistral text generation models

    Anthropic

    • claude-3-haiku-20240307

    • claude-3-sonnet-20240229

    • claude-3-opus-20240229

    Learn more about the Anthropic text generation models

  • query (string, required): The query string to use as the basis for text generation.

  • context (object, required): Settings that define the context of the query.

    • query (string, required): Specifies an expression to determine the relevant objects and index to serve as context for text generation. You may set this to any expression that produces a set of objects, even if it is not a standalone query.

    • variables (object, optional): A dictionary of variables for use in the context query.

    • globals (object, optional): A dictionary of globals for use in the context query.

    • max_object_count (int, optional): Maximum number of objects to return; default is 5.

  • stream (boolean, optional): Specifies whether the response should be streamed. Defaults to false.

  • prompt (object, optional): Settings that define a prompt. Omit to use the default prompt.

    You may specify an existing prompt by its name or id, you may define a custom prompt inline by sending an array of objects, or you may do both to augment an existing prompt with additional custom messages.

    • name (string, optional) or id (string, optional): The name or id of an existing custom prompt to use. Provide only one of these if you want to use or start from an existing prompt.

    • custom (array of objects, optional): Custom prompt messages, each containing a role and content. If no name or id was provided, the custom messages provided here become the prompt. If one of those was provided, these messages will be added to that existing prompt.

Example request

curl --user <username>:<password> --json '{
  "query": "What color is the sky on Mars?",
  "model": "gpt-4-turbo-preview",
  "context": {"query":"Knowledge"}
}' http://<edgedb-host>:<port>/branch/main/ai/rag

Example successful response

  • HTTP status: 200 OK

  • Content-Type: application/json

  • Body:

    {"response": "The sky on Mars is red."}

Example error response

  • HTTP status: 400 Bad Request

  • Content-Type: application/json

  • Body:

    {
      "message": "missing required 'query' in request 'context' object",
      "type": "BadRequestError"
    }

When the stream parameter is set to true, the server uses Server-Sent Events (SSE) to stream responses. Here is a detailed breakdown of the typical sequence and structure of events in a streaming response:

  • HTTP Status: 200 OK

  • Content-Type: text/event-stream

  • Cache-Control: no-cache

The stream consists of a sequence of events of the following types, each encapsulating part of the response in a structured format (the full example below also includes a message_delta event carrying the stop_reason):

  1. Message start

    • Event type: message_start

    • Data: Starts a message, specifying identifiers and roles.

    {
      "type": "message_start",
      "message": {
        "id": "<message_id>",
        "role": "assistant",
        "model": "<model_name>"
      }
    }
  2. Content block start

    • Event type: content_block_start

    • Data: Marks the beginning of a new content block.

    {
      "type": "content_block_start",
      "index": 0,
      "content_block": {
        "type": "text",
        "text": ""
      }
    }
  3. Content block delta

    • Event type: content_block_delta

    • Data: Incrementally updates the content, appending more text to the message.

    {
      "type": "content_block_delta",
      "index": 0,
      "delta": {
        "type": "text_delta",
        "text": "The"
      }
    }

    Subsequent content_block_delta events add more text to the message.

  4. Content block stop

    • Event type: content_block_stop

    • Data: Marks the end of a content block.

    {
      "type": "content_block_stop",
      "index": 0
    }
  5. Message stop

    • Event type: message_stop

    • Data: Marks the end of the message.

    {"type": "message_stop"}

Each event is sent as a separate SSE message, formatted as shown above. The connection is closed after all events are sent, signaling the end of the stream.

Example SSE response

event: message_start
data: {"type": "message_start", "message": {"id": "chatcmpl-9MzuQiF0SxUjFLRjIdT3mTVaMWwiv", "role": "assistant", "model": "gpt-4-0125-preview"}}

event: content_block_start
data: {"type": "content_block_start","index":0,"content_block":{"type":"text","text":""}}

event: content_block_delta
data: {"type": "content_block_delta","index":0,"delta":{"type": "text_delta", "text": "The"}}

event: content_block_delta
data: {"type": "content_block_delta","index":0,"delta":{"type": "text_delta", "text": " skies"}}

event: content_block_delta
data: {"type": "content_block_delta","index":0,"delta":{"type": "text_delta", "text": " on"}}

event: content_block_delta
data: {"type": "content_block_delta","index":0,"delta":{"type": "text_delta", "text": " Mars"}}

event: content_block_delta
data: {"type": "content_block_delta","index":0,"delta":{"type": "text_delta", "text": " are"}}

event: content_block_delta
data: {"type": "content_block_delta","index":0,"delta":{"type": "text_delta", "text": " red"}}

event: content_block_delta
data: {"type": "content_block_delta","index":0,"delta":{"type": "text_delta", "text": "."}}

event: content_block_stop
data: {"type": "content_block_stop","index":0}

event: message_delta
data: {"type": "message_delta", "delta": {"stop_reason": "stop"}}

event: message_stop
data: {"type": "message_stop"}

POST: https://<edgedb-host>:<port>/branch/<branch-name>/ai/embeddings

Responds with embeddings generated by the specified embeddings model in response to the provided input.

Make a POST request to the endpoint with a JSON body. The body may have these properties:

  • input (array of strings or a single string, required): The text to use as the basis for embeddings generation.

  • model (string, required): The name of the embedding model to use. You may use any of the supported embedding models.

Example request

curl --user <username>:<password> --json '{
  "input": "What color is the sky on Mars?",
  "model": "text-embedding-3-small"
}' http://localhost:10931/branch/main/ai/embeddings

Example successful response

  • HTTP status: 200 OK

  • Content-Type: application/json

  • Body:

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [-0.009434271, 0.009137661]
    }
  ],
  "model": "text-embedding-3-small",
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  }
}

The embedding property is shown here with only two values for brevity, but an actual response would contain many more values.

Example error response

  • HTTP status: 400 Bad Request

  • Content-Type: application/json

  • Body:

    {
      "message": "missing or empty required \"model\" value  in request",
      "type": "BadRequestError"
    }