I thought I’d quickly show something which may come in handy for the community…
You CAN implement ChromaDB as a vector store directly in your Anvil app.
First, switch your Python version and add the following packages (you’ll need both for this to work):
chromadb
pysqlite3-binary
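Once that’s built, in a server module you’ll need the following import statements. The first three lines swap pysqlite3 in for the standard sqlite3 module, so they have to run before chromadb is imported: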
__import__('pysqlite3')
import sys
sys.modules['sqlite3'] = sys.modules.pop('pysqlite3')
import chromadb
from chromadb.utils import embedding_functions
from chromadb.config import Settings
chroma_client = chromadb.Client(Settings(allow_reset=True))
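For anyone wondering why the sqlite3 swap is there: as far as I know, Chroma needs a newer SQLite (3.35 or later) than some hosted Python environments ship with, and pysqlite3-binary bundles its own. Here is a minimal sketch of the same swap with a fallback, assuming you would rather carry on with the standard module than crash when pysqlite3 is missing. The function name use_pysqlite3_if_available is my own, not part of any library.

```python
import sys


def use_pysqlite3_if_available() -> bool:
    """Swap pysqlite3 in for sqlite3 if it is installed; return True if swapped."""
    try:
        __import__("pysqlite3")
    except ImportError:
        # pysqlite3-binary is not installed, so leave the standard sqlite3 alone.
        return False
    # Anything that imports sqlite3 after this point gets pysqlite3 instead.
    sys.modules["sqlite3"] = sys.modules.pop("pysqlite3")
    return True


# Call this before "import chromadb", otherwise Chroma has already picked up sqlite3.
swapped = use_pysqlite3_if_available()
print("using pysqlite3" if swapped else "using the standard sqlite3")
```

If this prints the fallback message in your Anvil environment, the package probably did not install, and Chroma may complain about the SQLite version when it is imported.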
You can then easily create and manage your collections in your sessions, even using authenticated calls (with the require_user=True parameter on your server callables). In this example I’m using the user ID as the collection name for the duration of the session, to give me fine-grained control over the data going into a RAG model. It’s fairly self-explanatory, noting that you have to get the formatting right for the collection name or you’ll hit errors:
# Assumed alias (not shown in the original post): a server callable that requires a logged-in user.
authenticated_call = anvil.server.callable(require_user=True)

@authenticated_call
def create_collection():
    # The row ID contains brackets and a comma; strip them so Chroma accepts the name.
    raw_id = anvil.users.get_user().get_id()
    sanitized_id = raw_id.replace("[", "").replace("]", "").replace(",", "_")
    collection_name = f"user_{sanitized_id}_session"
    existing_collections = chroma_client.list_collections()
    existing_collection_names = [col.name for col in existing_collections]
    # Only create the collection if it does not already exist.
    if collection_name not in existing_collection_names:
        collection = chroma_client.create_collection(name=collection_name)

@authenticated_call
def delete_collection():
    raw_id = anvil.users.get_user().get_id()
    sanitized_id = raw_id.replace("[", "").replace("]", "").replace(",", "_")
    collection_name = f"user_{sanitized_id}_session"
    existing_collections = chroma_client.list_collections()
    existing_collection_names = [col.name for col in existing_collections]
    if collection_name in existing_collection_names:
        chroma_client.delete_collection(name=collection_name)
        # Note: reset() wipes every collection on this client, not only this user's.
        # Drop this line if other sessions share the same client.
        chroma_client.reset()
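Both functions build the collection name the same way, so that step can be pulled into one helper. This is only a sketch: collection_name_for is a hypothetical name, the example row ID is made up, and the 3 to 63 character limit is my understanding of Chroma’s naming rules at the time of writing, so check the docs for your version.

```python
import re


def collection_name_for(raw_id: str) -> str:
    """Build a collection name from a row ID string such as "[169162,297786153]"."""
    # Keep only the runs of letters and digits, dropping brackets, commas and spaces.
    parts = re.findall(r"[A-Za-z0-9]+", raw_id)
    if not parts:
        raise ValueError(f"no usable characters in row ID: {raw_id!r}")
    name = f"user_{'_'.join(parts)}_session"
    # Assumed limit: Chroma has rejected names outside 3 to 63 characters.
    if not 3 <= len(name) <= 63:
        raise ValueError(f"collection name has an unsupported length: {name!r}")
    return name


print(collection_name_for("[169162,297786153]"))
```

In the server functions you would then call collection_name_for(anvil.users.get_user().get_id()) instead of repeating the replace chain, which keeps create and delete in agreement about the name.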
And that’s it really.
The Chroma docs are decent, but this guide is pretty damn nifty.
I’m using it with LlamaIndex and Ollama, and it’s ludicrously easy to get to grips with.
Hopefully knowing that we can do this without the need for Pinecone or other black-box dependencies will also come in useful to someone else.