ext::ai
To activate EdgeDB AI functionality, use the extension mechanism in your schema:
using extension ai;
Configuration
Use the configure command to set configuration for the AI extension. Update the values using configure session or configure current branch depending on the scope you prefer:
db> configure current branch
...     set ext::ai::Config::indexer_naptime := <duration>'0:00:30';
OK: CONFIGURE DATABASE
The only property currently available is indexer_naptime, which specifies the minimum delay between deferred ext::ai::index indexer runs on any given branch.
Examine the extensions link of the cfg::Config object to check the current config values:
db> select cfg::Config.extensions[is ext::ai::Config]{*};
{
  ext::ai::Config {
    id: 1a53f942-d7ce-5610-8be2-c013fbe704db,
    indexer_naptime: <duration>'0:00:30'
  }
}
You may also restore the default config value using configure session reset if you set it on the session, or configure current branch reset if you set it on the branch:
db> configure current branch reset ext::ai::Config::indexer_naptime;
OK: CONFIGURE DATABASE
Providers
Provider configs are required for AI indexes (for embedding generation) and for RAG (for text generation). They may be added via edgedb ui or via EdgeQL:
configure current database
insert ext::ai::OpenAIProviderConfig {
secret := 'sk-....',
};
The extension makes available types for each provider and for a custom provider compatible with one of the supported API styles:
- ext::ai::OpenAIProviderConfig
- ext::ai::MistralProviderConfig
- ext::ai::AnthropicProviderConfig
- ext::ai::CustomProviderConfig
All provider types require that the secret property be set with a string containing the secret provided by the AI vendor. Other properties may optionally be set:
- name - A unique provider name
- display_name - A human-friendly provider name
- api_url - The provider's API URL
- client_id - ID for the client provided by the model API vendor
In addition to the required secret property, ext::ai::CustomProviderConfig requires an api_style property be set. Available values are ext::ai::ProviderAPIStyle.OpenAI and ext::ai::ProviderAPIStyle.Anthropic.
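As a sketch, a custom provider compatible with the OpenAI API style might be configured like this (the name, display_name, api_url, and secret values here are placeholders):

```edgeql
configure current database
insert ext::ai::CustomProviderConfig {
    name := 'my-provider',
    display_name := 'My Provider',
    api_url := 'https://api.example.com/v1',
    secret := 'sk-....',
    api_style := ext::ai::ProviderAPIStyle.OpenAI,
};
```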
Prompts
You may add prompts either via edgedb ui or via EdgeQL. Here’s an example of how you might add a prompt with a single message:
insert ext::ai::ChatPrompt {
name := 'test-prompt',
messages := (
insert ext::ai::ChatPromptMessage {
participant_role := ext::ai::ChatParticipantRole.System,
content := "Your message content"
}
)
};
participant_role may be any of these values:
- ext::ai::ChatParticipantRole.System
- ext::ai::ChatParticipantRole.User
- ext::ai::ChatParticipantRole.Assistant
- ext::ai::ChatParticipantRole.Tool
ext::ai::ChatPromptMessage also has a participant_name property, an optional str.
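For illustration, a prompt carrying both a System and a User message might look like this sketch (the prompt name, participant_name, and message contents are placeholders):

```edgeql
insert ext::ai::ChatPrompt {
    name := 'two-message-prompt',
    messages := {
        (insert ext::ai::ChatPromptMessage {
            participant_role := ext::ai::ChatParticipantRole.System,
            content := 'You are a helpful assistant.'
        }),
        (insert ext::ai::ChatPromptMessage {
            participant_role := ext::ai::ChatParticipantRole.User,
            participant_name := 'Reader',
            content := 'Answer briefly.'
        }),
    }
};
```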
Index
The ext::ai::index creates a deferred semantic similarity index of an expression on a type.
module default {
type Astronomy {
content: str;
deferred index ext::ai::index(embedding_model := 'text-embedding-3-small')
on (.content);
}
};
It can accept several named arguments:
- embedding_model - The name of the model to use for embedding generation, as a string. You may use any of these pre-configured embedding generation models:
  OpenAI
  - text-embedding-3-small
  - text-embedding-3-large
  - text-embedding-ada-002
  Learn more about the OpenAI embedding models
  Mistral
  - mistral-embed
- distance_function - The function to use for determining semantic similarity. Default: ext::ai::DistanceFunction.Cosine
  The distance function may be any of these:
  - ext::ai::DistanceFunction.Cosine
  - ext::ai::DistanceFunction.InnerProduct
  - ext::ai::DistanceFunction.L2
- index_type - The type of index to create. Currently the only option is the default: ext::ai::IndexType.HNSW.
- index_parameters - A named tuple of additional index parameters:
  - m - The maximum number of edges of each node in the graph. Increasing this can increase the accuracy of searches at the cost of index size. Default: 32
  - ef_construction - Dictates the depth and width of the search when building the index. Higher values can lead to better connections and more accurate results at the cost of time and resource usage when building the index. Default: 100
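Putting the named arguments together, an index that spells out all of the defaults explicitly might look like this sketch (the type and property names are taken from the earlier example; only embedding_model is required):

```sdl
module default {
    type Astronomy {
        content: str;
        deferred index ext::ai::index(
            embedding_model := 'text-embedding-3-small',
            distance_function := ext::ai::DistanceFunction.Cosine,
            index_type := ext::ai::IndexType.HNSW,
            index_parameters := (m := 32, ef_construction := 100),
        ) on (.content);
    }
};
```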
When indexes aren’t working…
If you find your queries are not returning the expected results, try inspecting your instance logs. On an EdgeDB Cloud instance, use the “Logs” tab in your instance dashboard. On local or CLI-linked remote instances, use edgedb instance logs -I <instance-name>. You may find the problem there.
Providers impose rate limits on their APIs, which can often be the source of AI index problems. If index creation hits a rate limit, EdgeDB will wait the indexer_naptime (see the docs on ext::ai configuration) and resume index creation.
If your indexed property contains values that exceed the token limit for a single request, you may consider truncating the property value in your index expression. You can do this with a string by slicing it:
module default {
type Astronomy {
content: str;
deferred index ext::ai::index(embedding_model := 'text-embedding-3-small')
on (.content[0:10000]);
}
};
This example will slice the first 10,000 characters of the content property for indexing.
Tokens are not equivalent to characters. For OpenAI embedding generation, you may test values via OpenAI’s web-based tokenizer. You may alternatively download the library OpenAI uses for tokenization from that same page if you prefer. By testing, you can get an idea how much of your content can be sent for indexing.
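If you want a rough guard before reaching for a tokenizer, the common ~4-characters-per-token rule of thumb can give a ballpark estimate. This is an approximation only, and the 8191-token default below is the documented input limit for OpenAI's text-embedding-3 models; verify the limit for your own provider and model:

```python
# Rough token budgeting for an indexed property. The ~4-characters-per-
# token ratio is a rule of thumb for English text, not OpenAI's actual
# tokenizer, so leave generous headroom.

def estimate_tokens(text: str) -> int:
    """Approximate token count using the ~4 chars/token heuristic."""
    return max(1, len(text) // 4)

def truncate_to_budget(text: str, max_tokens: int = 8191) -> str:
    """Truncate text so its estimated token count fits the model limit."""
    return text[:max_tokens * 4]

content = "Skies on Mars are red. " * 2000  # 46,000 characters
print(estimate_tokens(content))             # 11500 -- over the limit
print(len(truncate_to_budget(content)))     # 32764
```

A character-count slice like .content[0:10000] in the index expression is the schema-side equivalent of this truncation.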
Functions
- ext::ai::to_context - Evaluates the expression of an ai::index on the passed object and returns it.
- ext::ai::search - Search an object using its ai::index index.
ext::ai::to_context
Evaluates the expression of an ai::index on the passed object and returns it. This can be useful for confirming the basis of embedding generation for a particular object or type.
Given this schema:
module default {
type Astronomy {
topic: str;
content: str;
deferred index ext::ai::index(embedding_model := 'text-embedding-3-small')
on (.topic ++ ' ' ++ .content);
}
};
and with these inserts:
db> insert Astronomy {
...   topic := 'Mars',
...   content := 'Skies on Mars are red.'
... };
db> insert Astronomy {
...   topic := 'Earth',
...   content := 'Skies on Earth are blue.'
... };
to_context returns these results:
db> select ext::ai::to_context(Astronomy);
{'Mars Skies on Mars are red.', 'Earth Skies on Earth are blue.'}
db> select ext::ai::to_context((select Astronomy limit 1));
{'Mars Skies on Mars are red.'}
ext::ai::search
Search an object using its ai::index index. Returns objects that match the specified semantic query along with a similarity score.
The query argument should not be a textual query but the embeddings generated from a textual query. To have EdgeDB generate the query for you along with a text response, try our built-in RAG.
db> with query := <array<float32>><json>$query
... select ext::ai::search(Knowledge, query);
{
  (
    object := default::Knowledge {id: 9af0d0e8-0880-11ef-9b6b-4335855251c4},
    distance := 0.20410746335983276
  ),
  (
    object := default::Knowledge {id: eeacf638-07f6-11ef-b9e9-57078acfce39},
    distance := 0.7843298847773637
  ),
  (
    object := default::Knowledge {id: f70863c6-07f6-11ef-b9e9-3708318e69ee},
    distance := 0.8560434728860855
  ),
}
HTTP endpoints
Use the AI extension’s HTTP endpoints to perform retrieval-augmented generation using your AI indexes or to generate embeddings against a model of your choice.
All EdgeDB server HTTP endpoints require authentication. By default, you may use HTTP Basic Authentication with your EdgeDB username and password.
RAG
POST https://<edgedb-host>:<port>/branch/<branch-name>/ai/rag
Responds with text generated by the specified text generation model in response to the provided query.
Request
Make a POST request to the endpoint with a JSON body. The body may have these properties:
- model (string, required): The name of the text generation model to use. You may use any of these text generation models:
  OpenAI
  - gpt-3.5-turbo
  - gpt-4-turbo-preview
  Learn more about the OpenAI text generation models
  Mistral
  - mistral-small-latest
  - mistral-medium-latest
  - mistral-large-latest
  Learn more about the Mistral text generation models
  Anthropic
  - claude-3-haiku-20240307
  - claude-3-sonnet-20240229
  - claude-3-opus-20240229
- query (string, required): The query string to use as the basis for text generation.
- context (object, required): Settings that define the context of the query.
  - query (string, required): Specifies an expression to determine the relevant objects and index to serve as context for text generation. You may set this to any expression that produces a set of objects, even if it is not a standalone query.
  - variables (object, optional): A dictionary of variables for use in the context query.
  - globals (object, optional): A dictionary of globals for use in the context query.
  - max_object_count (int, optional): Maximum number of objects to return; default is 5.
- stream (boolean, optional): Specifies whether the response should be streamed. Defaults to false.
- prompt (object, optional): Settings that define a prompt. Omit to use the default prompt. You may specify an existing prompt by its name or id, you may define a custom prompt inline by sending an array of objects, or you may do both to augment an existing prompt with additional custom messages.
  - name (string, optional) or id (string, optional): The name or id of an existing custom prompt to use. Provide only one of these if you want to use or start from an existing prompt.
  - custom (array of objects, optional): Custom prompt messages, each containing a role and content. If no name or id was provided, the custom messages provided here become the prompt. If one of those was provided, these messages will be added to that existing prompt.
Example request
curl --user <username>:<password> --json '{
"query": "What color is the sky on Mars?",
"model": "gpt-4-turbo-preview",
"context": {"query":"Knowledge"}
}' http://<edgedb-host>:<port>/branch/main/ai/rag
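The same request can be built with Python's standard library alone. This is a sketch: the host, port, credentials, and the Knowledge type are placeholders for your own instance and schema, and the inline custom prompt is optional:

```python
import base64
import json
import urllib.request

# Build a RAG request body. The context query and model are taken from
# the curl example above; "Knowledge" is a placeholder type name.
body = {
    "query": "What color is the sky on Mars?",
    "model": "gpt-4-turbo-preview",
    "context": {"query": "Knowledge", "max_object_count": 3},
    # Optional: augment the default prompt with an inline custom message.
    "prompt": {
        "custom": [
            {"role": "system", "content": "Answer in one short sentence."}
        ]
    },
}

def make_request(host: str, port: int, username: str, password: str) -> urllib.request.Request:
    # HTTP Basic Authentication, as required by EdgeDB's HTTP endpoints.
    creds = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        f"http://{host}:{port}/branch/main/ai/rag",
        data=json.dumps(body).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {creds}",
        },
    )

# Sending the request (requires a running instance):
# with urllib.request.urlopen(make_request("localhost", 10700, "edgedb", "password")) as resp:
#     print(json.loads(resp.read())["response"])
```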
Response
Example successful response
- HTTP status: 200 OK
- Content-Type: application/json
- Body:
  {"response": "The sky on Mars is red."}
Example error response
- HTTP status: 400 Bad Request
- Content-Type: application/json
- Body:
  {
    "message": "missing required 'query' in request 'context' object",
    "type": "BadRequestError"
  }
Streaming response (SSE)
When the stream
parameter is set to true
, the server uses Server-Sent
Events
(SSE) to stream responses. Here is a detailed breakdown of the typical
sequence and structure of events in a streaming response:
- HTTP Status: 200 OK
- Content-Type: text/event-stream
- Cache-Control: no-cache
The stream consists of a sequence of five events, each encapsulating part of the response in a structured format:
- Message start
  Event type: message_start
  Data: Starts a message, specifying identifiers and roles.
  {
    "type": "message_start",
    "message": {
      "id": "<message_id>",
      "role": "assistant",
      "model": "<model_name>"
    }
  }
- Content block start
  Event type: content_block_start
  Data: Marks the beginning of a new content block.
  {
    "type": "content_block_start",
    "index": 0,
    "content_block": {"type": "text", "text": ""}
  }
- Content block delta
  Event type: content_block_delta
  Data: Incrementally updates the content, appending more text to the message.
  {
    "type": "content_block_delta",
    "index": 0,
    "delta": {"type": "text_delta", "text": "The"}
  }
  Subsequent content_block_delta events add more text to the message.
- Content block stop
  Event type: content_block_stop
  Data: Marks the end of a content block.
  {"type": "content_block_stop", "index": 0}
- Message stop
  Event type: message_stop
  Data: Marks the end of the message.
  {"type": "message_stop"}
Each event is sent as a separate SSE message, formatted as shown above. The connection is closed after all events are sent, signaling the end of the stream.
Example SSE response
event: message_start
data: {"type": "message_start", "message": {"id": "chatcmpl-9MzuQiF0SxUjFLRjIdT3mTVaMWwiv", "role": "assistant", "model": "gpt-4-0125-preview"}}
event: content_block_start
data: {"type": "content_block_start","index":0,"content_block":{"type":"text","text":""}}
event: content_block_delta
data: {"type": "content_block_delta","index":0,"delta":{"type": "text_delta", "text": "The"}}
event: content_block_delta
data: {"type": "content_block_delta","index":0,"delta":{"type": "text_delta", "text": " skies"}}
event: content_block_delta
data: {"type": "content_block_delta","index":0,"delta":{"type": "text_delta", "text": " on"}}
event: content_block_delta
data: {"type": "content_block_delta","index":0,"delta":{"type": "text_delta", "text": " Mars"}}
event: content_block_delta
data: {"type": "content_block_delta","index":0,"delta":{"type": "text_delta", "text": " are"}}
event: content_block_delta
data: {"type": "content_block_delta","index":0,"delta":{"type": "text_delta", "text": " red"}}
event: content_block_delta
data: {"type": "content_block_delta","index":0,"delta":{"type": "text_delta", "text": "."}}
event: content_block_stop
data: {"type": "content_block_stop","index":0}
event: message_delta
data: {"type": "message_delta", "delta": {"stop_reason": "stop"}}
event: message_stop
data: {"type": "message_stop"}
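On the client side, a stream like the one above can be consumed with a small parser. This sketch assumes each SSE message is an "event:" line followed by a "data:" line, with messages separated by blank lines, matching the example output:

```python
import json

# Minimal parser for the SSE stream format shown above.
def parse_sse(stream_text: str) -> list[tuple[str, dict]]:
    events = []
    for chunk in stream_text.strip().split("\n\n"):
        event, data = None, None
        for line in chunk.splitlines():
            if line.startswith("event:"):
                event = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data = json.loads(line[len("data:"):].strip())
        if event is not None:
            events.append((event, data))
    return events

sample = (
    "event: content_block_delta\n"
    'data: {"type": "content_block_delta", "index": 0, '
    '"delta": {"type": "text_delta", "text": "The"}}\n'
    "\n"
    "event: message_stop\n"
    'data: {"type": "message_stop"}\n'
)
for name, payload in parse_sse(sample):
    print(name, payload["type"])
```

In a real client you would accumulate the text_delta fragments from content_block_delta events to rebuild the full response.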
Embeddings
POST https://<edgedb-host>:<port>/branch/<branch-name>/ai/embeddings
Responds with embeddings generated by the specified embeddings model in response to the provided input.
Request
Make a POST request to the endpoint with a JSON body. The body may have these properties:
- input (array of strings or a single string, required): The text to use as the basis for embeddings generation.
- model (string, required): The name of the embedding model to use. You may use any of the supported embedding models.
Example request
curl --user <username>:<password> --json '{
"input": "What color is the sky on Mars?",
"model": "text-embedding-3-small"
}' http://localhost:10931/branch/main/ai/embeddings
Response
Example successful response
- HTTP status: 200 OK
- Content-Type: application/json
- Body:
{
"object": "list",
"data": [
{
"object": "embedding",
"index": 0,
"embedding": [-0.009434271, 0.009137661]
}
],
"model": "text-embedding-3-small",
"usage": {
"prompt_tokens": 8,
"total_tokens": 8
}
}
The embedding property is shown here with only two values for brevity, but an actual response would contain many more values.
Example error response
- HTTP status: 400 Bad Request
- Content-Type: application/json
- Body:
  {
    "message": "missing or empty required \"model\" value in request",
    "type": "BadRequestError"
  }
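The distance values returned by ext::ai::search depend on the index's distance_function. As an illustration of the default, cosine distance can be computed by hand from two embedding vectors; the short vectors below are made up for the example (real text-embedding-3-small embeddings have 1536 dimensions):

```python
import math

# Cosine distance, the default distance_function for ext::ai::index.
def cosine_distance(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

# Made-up vectors standing in for embeddings of three documents.
mars = [0.9, 0.1, 0.2]
earth = [0.8, 0.2, 0.3]
moon = [-0.1, 0.9, 0.4]

# Lower distance means more semantically similar.
print(cosine_distance(mars, earth) < cosine_distance(mars, moon))  # True
```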