You can start using RAG without setting up a big fancy vector database. I made a working setup with just embeddings stored in a file and a simple cosine similarity search.
The basic setup
1. Storing the embeddings
I did not use a database here. A plain JSON file worked fine.
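import * as fs from 'fs'

interface Document {
  id: string
  content: string
  embedding: number[]
  metadata: {
    title: string
    date: string
  }
}

// readFileSync returns a Buffer unless an encoding is given, and JSON.parse expects a string
const documents: Document[] = JSON.parse(fs.readFileSync('embeddings.json', 'utf8'))

2. Finding the most similar documents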
function cosineSimilarity(a: number[], b: number[]): number {
  // dot product of the two vectors
  const dot = a.reduce((sum, val, i) => sum + val * b[i], 0)
  // Euclidean length (magnitude) of each vector
  const magA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0))
  const magB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0))
  // an all-zero vector has no direction; return 0 instead of dividing by zero (NaN)
  if (magA === 0 || magB === 0) return 0
  return dot / (magA * magB)
}
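async function search(query: string, docs: Document[], topK: number = 3) {
  // getEmbedding is not defined in this post; it stands for whatever call turns text into a vector,
  // and it has to use the same model that produced the stored document embeddings
  const queryEmbedding = await getEmbedding(query)
  return docs
    // score every document against the query
    .map(doc => ({
      ...doc,
      similarity: cosineSimilarity(queryEmbedding, doc.embedding)
    }))
    // highest similarity first, then keep the top K
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, topK)
}

How it performed for me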
- Setup took about two hours, far less than the couple of days it can take to stand up a dedicated vector DB.
- Average query time was around 50 milliseconds.
- Accuracy on my small test set was about 85 percent, which is decent.
- Storage cost was next to nothing: about five cents a month, compared to fifty dollars or more for a managed vector DB.
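The post does not say how the accuracy figure was measured. One common way to put a number on retrieval quality is the hit rate at K: the share of test queries whose known-relevant document shows up in the top K results. The sketch below assumes that definition; `EvalCase`, `hitRateAtK`, and the sample data are hypothetical names and values made up for illustration, not the test set behind the 85 percent.

```typescript
// One labeled test case: which document should come back, and what search actually returned.
// (Hypothetical shape, not taken from the original setup.)
interface EvalCase {
  expectedId: string      // id of the document a human marked as the right answer
  retrievedIds: string[]  // ids returned by search, best match first
}

// Fraction of cases where the expected document appears among the first k results.
function hitRateAtK(cases: EvalCase[], k: number): number {
  if (cases.length === 0) return 0
  const hits = cases.filter(c =>
    c.retrievedIds.slice(0, k).some(id => id === c.expectedId)
  ).length
  return hits / cases.length
}

// Made-up example data, only to show the shape of the input.
const sample: EvalCase[] = [
  { expectedId: 'a', retrievedIds: ['a', 'b', 'c'] },
  { expectedId: 'b', retrievedIds: ['c', 'b', 'a'] },
  { expectedId: 'c', retrievedIds: ['a', 'b', 'd'] },
  { expectedId: 'd', retrievedIds: ['d', 'a', 'b'] },
]

console.log(hitRateAtK(sample, 3)) // 0.75 (3 of 4 expected documents are in the top 3)
console.log(hitRateAtK(sample, 1)) // 0.5  (2 of 4 expected documents are ranked first)
```

With a real setup, `retrievedIds` would come from mapping the output of `search` to its `id` fields for each test query.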
When you might need to upgrade
Switch to a proper vector database if you have more than ten thousand documents, need instant updates, want advanced filtering, or need rock-solid uptime in production.
Things I learned
- It's really worth starting with something simple and making it better over time.
- Good data quality matters more than complex algorithms.
- Keep your embeddings consistent: embed documents and queries with the same model, and re-embed everything if you change it.
- Cache the embeddings you look up often to save time and cost.
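The last two points can be sketched in a few lines. This is a minimal illustration, not the code used in the setup above: `EmbedFn`, `withEmbeddingCache`, and `assertSameDimensions` are hypothetical names, the cache lives in memory only and never evicts anything, and the dimension check is just a cheap proxy for "these vectors came from the same model".

```typescript
// Any function that turns text into an embedding vector (assumed signature).
type EmbedFn = (text: string) => Promise<number[]>

// Wraps an embedding function with an in-memory cache keyed by the input text.
// The promise itself is cached, so concurrent lookups of the same text share one call.
function withEmbeddingCache(embed: EmbedFn): EmbedFn {
  const cache = new Map<string, Promise<number[]>>()
  return (text: string) => {
    const cached = cache.get(text)
    if (cached) return cached
    const pending = embed(text).catch(err => {
      // drop failed lookups from the cache so a later call can retry
      cache.delete(text)
      throw err
    })
    cache.set(text, pending)
    return pending
  }
}

// Throws if the stored embeddings do not all have the same length, which usually
// means some were produced by a different model or by a broken run.
// Returns the shared dimension (0 for an empty list).
function assertSameDimensions(docs: { id: string; embedding: number[] }[]): number {
  if (docs.length === 0) return 0
  const expected = docs[0].embedding.length
  for (const doc of docs) {
    if (doc.embedding.length !== expected) {
      throw new Error(
        `Document ${doc.id} has ${doc.embedding.length} dimensions, expected ${expected}`
      )
    }
  }
  return expected
}
```

Wrapping the embedding call once, for example `const getEmbedding = withEmbeddingCache(callEmbeddingApi)` where `callEmbeddingApi` is whatever function talks to your provider, means repeated queries skip the API entirely. Running `assertSameDimensions(documents)` right after loading the file catches mixed-up embeddings before they quietly skew the similarity scores.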