Back in EdgeDB 3.0 we added support for storing and searching embeddings via the ext::pgvector extension.
However, we felt that we could do much more than simple vector storage and search. In EdgeDB, traditional full-text search is as easy as adding an fts::index on the object of interest like so:
type BlogPost {
  content: str;

  index fts::index on (
    fts::with_options(
      .content,
      language := fts::Language.eng
    )
  );
}
and then searching like so:
select fts::search(BlogPost, 'my query')
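fts::search returns tuples of matching objects and their relevance scores, so a slightly fuller query might look like this (a sketch based on those object and score tuple fields):
with result := fts::search(BlogPost, 'my query')
select result.object {
  content,
  score := result.score,
}
order by result.score desc;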
We are happy to announce that in EdgeDB 5.0 indexing and searching content using semantic similarity is just as easy!
using extension ai;

type default::BlogPost {
  content: str;

  deferred index ext::ai::index(
    embedding_model := 'text-embedding-3-small'
  ) on (.content);
}
No more fiddling with embeddings! Just declare an index on a text property (or any text expression), and you’re ready to perform semantic similarity searches:
select ext::ai::search(BlogPost, vector)
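Here vector is a query embedding, and ext::ai::search returns tuples of objects and their distances from it. A fuller query might look like this (a sketch assuming the query embedding is supplied as a parameter):
with result := ext::ai::search(BlogPost, <array<float32>>$query_vector)
select result.object {
  content,
  distance := result.distance,
}
order by result.distance
limit 5;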
This works thanks to our new deferred index mechanism, also added in EdgeDB 5.0. Deferred indexes are populated and updated asynchronously, so object mutation isn’t blocked on indexing: the perfect solution for slow operations such as calling out to a remote LLM over an API.
Speaking of the API, the ext::ai extension ships with support for calling into OpenAI, Mistral, and Anthropic model APIs out of the box. Of course, you can plug in any model as long as it exposes an OpenAI- or Anthropic-compatible API (and support for more providers is coming in future releases).
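Providers are configured on the database side. For example, registering an OpenAI API key might look like this (a sketch assuming the extension's OpenAIProviderConfig config object; Mistral and Anthropic have analogous config objects):
configure current branch
insert ext::ai::OpenAIProviderConfig {
  # Placeholder key; supply your own secret here.
  secret := 'sk-...',
};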
Having a generic way of doing semantic search over an arbitrary set of objects opens the door to another great feature of the ext::ai extension: RAG with database data as context!
RAGs to riches
Code speaks for itself, so here’s all you need to add a database-powered RAG to your app:
import { createClient } from "edgedb";
import { createAI } from "@edgedb/ai";

const client = createClient();

const gpt4AI = createAI(client, {
  model: "gpt-4-turbo-preview",
});

const blogAI = gpt4AI.withContext({
  query: "select BlogPost",
});

console.log(
  await blogAI.queryRag("Were any of the blog posts about RAG?")
);
The new @edgedb/ai
JavaScript package provides a convenient wrapper
for the ext::ai
HTTP API, which, of course, can be used directly if that
better fits your needs:
$ curl --json '{
    "query": "Were any of the blog posts about RAG?",
    "model": "gpt-4-turbo-preview",
    "context": {"query": "select BlogPost"},
    "stream": true
  }' https://edgedb-host:port/branch/main/ai/rag
Most generative LLMs are quite slow, so good UX demands support for output streaming, which can be requested by passing "stream": true as in the example above.
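With the @edgedb/ai wrapper, consuming the stream might look like this (a sketch assuming the package exposes a streaming counterpart to queryRag, here called streamRag, that yields incremental chunks of the answer):
// Assumption: streamRag yields chunks as they arrive from the model.
for await (const chunk of blogAI.streamRag(
  "Were any of the blog posts about RAG?"
)) {
  // The exact chunk shape is an assumption; we render it as text.
  process.stdout.write(String(chunk));
}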
Don’t forget to authenticate your HTTP request!
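For example, with an EdgeDB Cloud secret key you could pass a bearer token (a sketch assuming token-based HTTP authentication is enabled for your instance):
$ curl -H "Authorization: Bearer $EDGEDB_SECRET_KEY" \
    --json '{ ... }' https://edgedb-host:port/branch/main/ai/rag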
What’s next?
The EdgeDB 5 ext::ai extension lets you build AI-enabled apps in minutes, and we are working to make it even more powerful by adding:

- definitions for more models and providers
- better integration with Vercel’s AI SDK
- guides, tutorials, and example projects
To try the new ext::ai
extension now, start by creating a new EdgeDB Cloud instance!
Ah, almost forgot. We’ve updated our built-in UI to let you have a conversation with your database: