@repo/db-upstash-vector

Vector database for semantic search and RAG (Retrieval-Augmented Generation). Store embeddings, search by meaning, build AI chat with context. Edge-compatible, works with OpenAI and Voyage embeddings.

Quick Start

Add semantic search in 10 minutes:

pnpm add @repo/db-upstash-vector

Store document embeddings, query by similarity, build RAG applications. Skip to Quick Start →

Why @repo/db-upstash-vector?

Keyword search doesn't understand meaning. Searching for "refund policy" won't find "money-back guarantee". AI chat needs relevant context from your docs, but fetching all documents is slow and expensive. Traditional vector databases require infrastructure management.

@repo/db-upstash-vector solves this with serverless vector search that works in edge runtimes.

Production-ready with embeddings generation, RAG helpers, document chunking, namespace isolation, and hybrid search.

Use cases

  • Semantic search — Find documents by meaning, not just keywords ("refund policy" matches "money-back guarantee")
  • RAG chatbots — Give AI relevant context from your docs for accurate answers
  • Recommendation engines — Find similar products, articles, or users based on embeddings
  • Duplicate detection — Identify near-duplicate content automatically
  • Question answering — Match questions to FAQ answers semantically

How it works

@repo/db-upstash-vector provides a vector database for storing and querying embeddings:

import { vectorIndex } from "@repo/db-upstash-vector";
import { generateEmbedding } from "@repo/db-upstash-vector/embeddings";

// Store document
const embedding = await generateEmbedding("Our 30-day money-back guarantee");
await vectorIndex.upsert({
  id: "doc_123",
  vector: embedding,
  metadata: { title: "Refund Policy" }
});

// Search by meaning
const queryEmbed = await generateEmbedding("refund policy");
const results = await vectorIndex.query({
  vector: queryEmbed,
  topK: 5
});
// Finds "30-day money-back guarantee" even though words don't match

Under the hood it uses Upstash Vector (a REST API that works in edge runtimes), OpenAI or Voyage embeddings, and cosine similarity for matching.
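The "matching" step is plain cosine similarity between embedding vectors. For intuition, a minimal sketch of the math (illustrative only, not part of the package):

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|).
// Identical directions score 1, orthogonal vectors score 0.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

cosineSimilarity([1, 0], [1, 0]); // 1 (same direction)
cosineSimilarity([1, 0], [0, 1]); // 0 (unrelated)
```

Because the score depends only on direction, documents that are phrased differently but cover the same topic end up close together.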

Key features

Semantic search — Find documents by meaning with similarity scores (0-1)

RAG helpers — Document chunking, context retrieval, threshold filtering

Embeddings — Generate with OpenAI (text-embedding-3-small/large) or Voyage

Metadata filtering — Filter by type, date, category before similarity search

Namespaces — Multi-tenant isolation within single index

Edge-compatible — REST API works in serverless and edge runtimes

Quick Start

1. Install dependencies

pnpm add @repo/db-upstash-vector @upstash/vector

2. Configure Upstash Vector and embedding provider

.env.local
UPSTASH_VECTOR_REST_URL=https://xxx-vector.upstash.io
UPSTASH_VECTOR_REST_TOKEN=AXxx...
OPENAI_API_KEY=sk-...

3. Ingest a document

app/lib/ingest.ts
import { ingestDocument } from "@repo/db-upstash-vector/rag";

export async function ingestKnowledgeBase() {
  await ingestDocument({
    id: "refund-policy",
    content: "We offer a 30-day money-back guarantee. No questions asked.",
    metadata: {
      source: "policies",
      category: "refunds"
    },
    chunkSize: 500,
    chunkOverlap: 50
  });
}

4. Query for relevant context (RAG)

app/api/chat/route.ts
import { ragQuery } from "@repo/db-upstash-vector/rag";
import { generateText } from "ai";

export async function POST(req: Request) {
  const { message } = await req.json();

  // Get relevant context
  const context = await ragQuery({
    query: message,
    topK: 5,
    threshold: 0.7
  });

  // Generate response with context
  const response = await generateText({
    model: "claude-3-sonnet",
    system: `Answer based on: ${context}`,
    prompt: message
  });

  return Response.json({ response });
}

That's it! You now have semantic search and RAG-powered AI chat.

Hybrid search

Combine semantic and keyword search for best results:

import { hybridSearch } from "@repo/db-upstash-vector/rag";

const results = await hybridSearch({
  query: "machine learning",
  vectorWeight: 0.7, // 70% semantic
  keywordWeight: 0.3, // 30% keywords
  topK: 10
});

Technical Details

For Developers: Technical implementation details

Upstash Vector integration for semantic search, embeddings storage, and RAG (Retrieval-Augmented Generation) applications.

Installation

pnpm add @repo/db-upstash-vector

Overview

Property         Value
Location         packages/db-upstash-vector
Dependencies     @upstash/vector
Edge Compatible  Yes

Export Paths

Path                                 Description
@repo/db-upstash-vector              Vector client instance
@repo/db-upstash-vector/embeddings   Embedding utilities
@repo/db-upstash-vector/rag          RAG helpers

Basic Usage

Upsert Vectors

import { vectorIndex } from "@repo/db-upstash-vector";

// Upsert single vector
await vectorIndex.upsert({
  id: "doc_123",
  // highlight-start
  vector: [0.1, 0.2, 0.3, ...], // Your embedding
  metadata: {
    title: "Document Title",
    content: "Document content...",
    type: "article",
  },
  // highlight-end
});

// Upsert multiple vectors
await vectorIndex.upsert([
  { id: "doc_1", vector: [...], metadata: { title: "Doc 1" } },
  { id: "doc_2", vector: [...], metadata: { title: "Doc 2" } },
]);

Query Vectors

Similarity Search

Queries return results sorted by similarity score (0-1). Higher scores indicate more similar vectors.

import { vectorIndex } from "@repo/db-upstash-vector";

const results = await vectorIndex.query({
  // highlight-start
  vector: queryEmbedding, // Your query embedding
  topK: 10,
  includeMetadata: true,
  includeVectors: false
  // highlight-end
});

// Results structure
// [
//   { id: "doc_123", score: 0.95, metadata: { title: "..." } },
//   { id: "doc_456", score: 0.87, metadata: { title: "..." } },
// ]

Filter Queries

import { vectorIndex } from "@repo/db-upstash-vector";

const results = await vectorIndex.query({
  vector: queryEmbedding,
  topK: 5,
  // highlight-start
  filter: {
    type: { $eq: "article" },
    publishedAt: { $gt: "2024-01-01" }
  },
  // highlight-end
  includeMetadata: true
});

Delete Vectors

import { vectorIndex } from "@repo/db-upstash-vector";

// Delete by ID
await vectorIndex.delete("doc_123");

// Delete multiple
await vectorIndex.delete(["doc_1", "doc_2", "doc_3"]);

// Delete by filter
// highlight-start
await vectorIndex.deleteByFilter({
  type: { $eq: "draft" }
});
// highlight-end

Embeddings

Generate Embeddings

import { generateEmbedding, generateEmbeddings } from "@repo/db-upstash-vector/embeddings";

// Single embedding
const embedding = await generateEmbedding("Hello, world!");

// Multiple embeddings (batched)
// highlight-start
const embeddings = await generateEmbeddings(["First document", "Second document", "Third document"]);
// highlight-end

OpenAI Embeddings

Model Selection

text-embedding-3-small offers a good balance of quality and cost. Use text-embedding-3-large for higher accuracy when needed.

import { createOpenAIEmbedder } from "@repo/db-upstash-vector/embeddings";

const embedder = createOpenAIEmbedder({
  // highlight-start
  model: "text-embedding-3-small",
  dimensions: 1536
  // highlight-end
});

const embedding = await embedder.embed("Your text here");

Voyage Embeddings

import { createVoyageEmbedder } from "@repo/db-upstash-vector/embeddings";

const embedder = createVoyageEmbedder({
  model: "voyage-2"
});

const embedding = await embedder.embed("Your text here");

RAG (Retrieval-Augmented Generation)

Basic RAG Query

Threshold Tuning

Start with a threshold of 0.7 and adjust based on your use case. Lower values include more results but may reduce relevance.

import { ragQuery } from "@repo/db-upstash-vector/rag";

// highlight-start
const context = await ragQuery({
  query: "What is the refund policy?",
  topK: 5,
  threshold: 0.7 // Minimum similarity score
});
// highlight-end

// Use context with LLM
const response = await generateText({
  model: "claude-3-sonnet",
  system: `Answer based on this context:\n${context}`,
  prompt: "What is the refund policy?"
});

Document Ingestion

import { ingestDocument } from "@repo/db-upstash-vector/rag";

await ingestDocument({
  id: "doc_123",
  content: "Your document content here...",
  metadata: {
    source: "knowledge-base",
    category: "policies"
  },
  // highlight-start
  chunkSize: 500,
  chunkOverlap: 50
  // highlight-end
});

Chunk Text

Chunking Strategy

Use smaller chunks (300-500 chars) for precise retrieval, larger chunks (1000-2000 chars) for more context. Overlap helps maintain continuity.

import { chunkText } from "@repo/db-upstash-vector/rag";

const chunks = chunkText(longDocument, {
  // highlight-start
  size: 500, // Characters per chunk
  overlap: 50, // Overlap between chunks
  separator: "\n\n" // Prefer splitting at paragraphs
  // highlight-end
});

// chunks: ["First chunk...", "Second chunk...", ...]
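A character-window chunker with overlap can be sketched roughly like this (a simplified illustration of the idea, not the package's actual implementation):

```typescript
// Walk the text in windows of `size` characters, stepping back by
// `overlap` so consecutive chunks share context. When a separator
// falls inside the window, cut there instead of mid-sentence.
function chunk(
  text: string,
  size: number,
  overlap: number,
  separator = "\n\n",
): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    let end = Math.min(start + size, text.length);
    if (end < text.length) {
      const cut = text.lastIndexOf(separator, end);
      if (cut > start) end = cut; // prefer a clean paragraph break
    }
    chunks.push(text.slice(start, end).trim());
    if (end >= text.length) break;
    start = Math.max(end - overlap, start + 1); // always make progress
  }
  return chunks;
}
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which is what keeps retrieval from missing answers that happen to sit at a split point.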

Namespaces

Multi-Tenancy

Use namespaces to isolate data between tenants, environments, or document collections within a single index.

import { vectorIndex } from "@repo/db-upstash-vector";

// Use namespace for multi-tenant isolation
// highlight-next-line
const namespace = vectorIndex.namespace("tenant_123");

await namespace.upsert({
  id: "doc_1",
  vector: [...],
  metadata: { title: "Tenant Document" },
});

const results = await namespace.query({
  vector: queryEmbedding,
  topK: 5,
});

Hybrid Search

import { hybridSearch } from "@repo/db-upstash-vector/rag";

const results = await hybridSearch({
  query: "machine learning tutorials",
  // highlight-start
  vectorWeight: 0.7, // 70% semantic
  keywordWeight: 0.3, // 30% keyword matching
  // highlight-end
  topK: 10
});
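Conceptually, hybrid search fuses the two rankings with a weighted sum of per-document scores. A rough sketch of that fusion step (hypothetical; `fuseScores` and the `Scored` shape are illustrative, not exports of this package):

```typescript
interface Scored {
  id: string;
  score: number;
}

// Combine semantic and keyword rankings into one weighted score per
// document; documents found by both searches accumulate both terms.
function fuseScores(
  vectorHits: Scored[],
  keywordHits: Scored[],
  vectorWeight: number,
  keywordWeight: number,
): Scored[] {
  const combined = new Map<string, number>();
  for (const { id, score } of vectorHits) {
    combined.set(id, (combined.get(id) ?? 0) + vectorWeight * score);
  }
  for (const { id, score } of keywordHits) {
    combined.set(id, (combined.get(id) ?? 0) + keywordWeight * score);
  }
  return [...combined.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```

A document that ranks moderately in both lists can outrank one that ranks highly in only one, which is why hybrid search tends to be robust to queries that mix exact terms with paraphrase.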

Index Statistics

import { vectorIndex } from "@repo/db-upstash-vector";

const stats = await vectorIndex.info();
// {
//   vectorCount: 10000,
//   pendingVectorCount: 5,
//   indexSize: 1024000,
//   dimension: 1536,
//   similarityFunction: "COSINE"
// }

Best Practices

Batch Operations

Rate Limits

Upstash has rate limits on upserts. Use batch sizes of 100-1000 vectors depending on your plan.

import { vectorIndex } from "@repo/db-upstash-vector";

// Batch upserts for better performance
// highlight-next-line
const BATCH_SIZE = 100;

for (let i = 0; i < vectors.length; i += BATCH_SIZE) {
  const batch = vectors.slice(i, i + BATCH_SIZE);
  await vectorIndex.upsert(batch);
}
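If a batch is rejected (for example with a 429), retrying with exponential backoff is a common companion pattern. A sketch, where `upsertBatch` stands in for `vectorIndex.upsert` and the delays are illustrative:

```typescript
// Retry a failed batch with exponentially growing delays before
// giving up and rethrowing the last error.
async function upsertWithRetry<T>(
  upsertBatch: (batch: T[]) => Promise<void>,
  batch: T[],
  maxRetries = 3,
): Promise<void> {
  for (let attempt = 0; ; attempt++) {
    try {
      await upsertBatch(batch);
      return;
    } catch (err) {
      if (attempt >= maxRetries) throw err;
      const delayMs = 2 ** attempt * 250; // 250ms, 500ms, 1s, ...
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

Wrapping each batch in the loop above keeps a transient rate-limit error from aborting a long ingestion run.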

Metadata Design

// Good: Filterable fields at top level
await vectorIndex.upsert({
  id: "doc_123",
  vector: [...],
  metadata: {
    // highlight-start
    type: "article", // Filterable
    category: "tech", // Filterable
    publishedAt: "2024-01-15", // Filterable
    // highlight-end
    content: "Full text...", // Retrieved with results
  },
});

Environment Variables

# Upstash Vector
UPSTASH_VECTOR_REST_URL="https://xxx-vector.upstash.io"
UPSTASH_VECTOR_REST_TOKEN="AXxx..."

# Embedding provider
OPENAI_API_KEY="sk-..."
# or
VOYAGE_API_KEY="..."
