What you'll build

A semantic search engine that matches on the meaning of your query, not just its keywords.

Imagine searching for "domestic animal resting" and finding documents about "the cat sleeps on the sofa". That's vector search: it converts text into numeric vectors (embeddings) and ranks documents by how close their vectors are to the query's vector.
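
To make "finds conceptual similarities" concrete, here is a minimal sketch of cosine similarity, the closeness measure used later in this tutorial. The three-dimensional vectors are hand-made, hypothetical values chosen for illustration; a real embedding model produces vectors with hundreds of dimensions, and `cosine_similarity` is an illustrative helper, not part of ChromaDB.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy "embeddings". The dimensions loosely stand for:
# animal-ness, rest-ness, tech-ness.
toy_vectors = {
    "The cat sleeps on the sofa": [0.9, 0.8, 0.0],
    "My dog runs in the park": [0.9, 0.1, 0.0],
    "Python is a programming language": [0.0, 0.1, 0.9],
}
query_vector = [0.8, 0.9, 0.1]  # stand-in for "domestic animal resting"

# Rank documents from most to least similar to the query
ranked = sorted(
    toy_vectors.items(),
    key=lambda item: cosine_similarity(query_vector, item[1]),
    reverse=True,
)
for text, vec in ranked:
    print(f"{cosine_similarity(query_vector, vec):.2f}  {text}")
```

With these toy values the cat sentence ranks first and the Python sentence last, even though the query shares no words with either.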

When finished, you'll have a Python system with ChromaDB that indexes documents, creates embeddings with Gemini, and allows searching by meaning. Perfect for searching FAQs, articles, or any text collection.

The prompt to start

Create a vector search system in Python with:

ChromaDB as vector database

Gemini embeddings (free tier)

Function to add documents

Function to search similar

Show similarity score

What the AI will create

import hashlib
import os

import chromadb
from chromadb.utils import embedding_functions

# Configure embeddings with Gemini
gemini_ef = embedding_functions.GoogleGenerativeAiEmbeddingFunction(
    api_key=os.environ["GEMINI_API_KEY"],
    model_name="models/embedding-001"
)

# Create a persistent client and a collection that uses cosine distance
client = chromadb.PersistentClient(path="./vector_db")
collection = client.get_or_create_collection(
    name="documents",
    embedding_function=gemini_ef,
    metadata={"hnsw:space": "cosine"}
)

def add_documents(docs: list[dict]):
    """Add documents with metadata"""
    texts = [d["text"] for d in docs]
    collection.upsert(
        documents=texts,
        metadatas=[{"source": d.get("source", "unknown")} for d in docs],
        # Derive each ID from the text itself: positional IDs like "doc_0"
        # would be reused on every call and clash with earlier documents.
        # Hash-based IDs plus upsert also make re-running the script safe.
        ids=[hashlib.sha256(t.encode("utf-8")).hexdigest() for t in texts]
    )

def search(query: str, n_results: int = 5):
    """Search similar documents"""
    results = collection.query(
        query_texts=[query],
        n_results=n_results,
        include=["documents", "distances", "metadatas"]
    )

    for i, (doc, dist, meta) in enumerate(zip(
        results["documents"][0],
        results["distances"][0],
        results["metadatas"][0]
    )):
        # With "hnsw:space": "cosine", dist is 1 - cosine similarity,
        # so subtracting it from 1 recovers the similarity.
        similarity = 1 - dist
        print(f"{i+1}. [{similarity:.2%}] {doc[:100]}...")
        print(f"   Source: {meta['source']}")

# Example
docs = [
    {"text": "The cat sleeps on the sofa", "source": "pets.txt"},
    {"text": "Python is a programming language", "source": "tech.txt"},
    {"text": "My dog runs in the park", "source": "pets.txt"},
    {"text": "JavaScript is used for web", "source": "tech.txt"},
]

add_documents(docs)
search("domestic animal resting")
# → Should rank "The cat sleeps on the sofa" first
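
The score line in `search` can be isolated into small helpers, which makes the conversion easier to check without a database or API key. This is a sketch under the assumption that the collection uses cosine distance, where 0 means identical direction and 2 means opposite; `distance_to_similarity` and `format_hit` are hypothetical names, and the distance passed in is a made-up value, not real query output.

```python
def distance_to_similarity(distance: float) -> float:
    """Map a cosine distance in [0, 2] to a cosine similarity in [-1, 1]."""
    return 1.0 - distance

def format_hit(rank: int, text: str, distance: float, max_chars: int = 100) -> str:
    """Format one result line; add an ellipsis only when the text is truncated."""
    similarity = distance_to_similarity(distance)
    preview = text if len(text) <= max_chars else text[:max_chars] + "..."
    return f"{rank}. [{similarity:.2%}] {preview}"

# Hypothetical distance, chosen for illustration
print(format_hit(1, "The cat sleeps on the sofa", 0.25))
# → 1. [75.00%] The cat sleeps on the sofa
```

Because cosine distance can exceed 1, the resulting score can be negative for documents that point away from the query; that is expected rather than a bug.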

SQL vs Vector comparison

Traditional SQL	Vector search
`LIKE '%cat%'` matches the literal string "cat"	Matches by meaning, not by characters
Only exact matches	Understands synonyms
No context	A query for "pet" also finds "cat" and "dog"
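
The SQL column of the comparison can be reproduced with Python's built-in `sqlite3` module. This sketch loads the same four example sentences into an in-memory table (the table name `documents` and the helper `like_search` are illustrative choices) and shows that `LIKE` only finds rows containing the literal search string.

```python
import sqlite3

docs = [
    "The cat sleeps on the sofa",
    "Python is a programming language",
    "My dog runs in the park",
    "JavaScript is used for web",
]

# In-memory database, so nothing is written to disk
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (text TEXT)")
conn.executemany("INSERT INTO documents (text) VALUES (?)", [(d,) for d in docs])

def like_search(term: str) -> list[str]:
    """Return rows whose text contains the term as a literal substring."""
    rows = conn.execute(
        "SELECT text FROM documents WHERE text LIKE ?", (f"%{term}%",)
    )
    return [row[0] for row in rows]

print(like_search("cat"))                      # → ['The cat sleeps on the sofa']
print(like_search("pet"))                      # → []
print(like_search("domestic animal resting"))  # → []
```

The last two queries return nothing because no row contains those exact characters, which is the gap the embedding-based `search` function is meant to close.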

Next level

→ Custom MCP Server
