Capstone Project: PrepMind
AI RAG Assistant
You've completed all 10 checkpoints. Now build a real, deployed, interview-ready portfolio project: a RAG chatbot that answers questions over your own knowledge base, using Firestore's native vector search and the Gemini API, entirely on the free tier.
What You'll Build
A fully deployed RAG chatbot that demonstrates everything from CP-01 to CP-10.
Indexing Pipeline
Upload documents → chunk with recursive splitting → embed with Gemini text-embedding-004 → store in Firestore's native vector index. Supports PDF, Markdown, and plain text.
RAG Query Engine
User asks a question → embed the query → Firestore vector search retrieves the top 5 chunks → assemble a context prompt → Gemini generates a grounded answer with citations.
Firebase Hosting UI
A clean chat interface hosted on Firebase Hosting. Google login via Firebase Auth. Conversation history stored in Firestore. Streaming responses via Firebase Genkit.
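From the chat page, the flow can be invoked with a plain fetch. A minimal sketch, assuming the flow is served at a `/prepMindFlow` endpoint and follows Genkit's default `{ data: … }` request / `{ result: … }` response envelope (the path and envelope are assumptions to verify against your deployment):

```javascript
// chat.js — minimal client-side call to the RAG flow (sketch).
// The endpoint path and the { data } / { result } envelope are
// assumptions based on Genkit's default HTTP serving convention;
// check them against your actual deployment.
function buildFlowRequest(userQuery) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ data: userQuery }),
  };
}

async function askPrepMind(userQuery, endpoint = "/prepMindFlow") {
  const res = await fetch(endpoint, buildFlowRequest(userQuery));
  if (!res.ok) throw new Error(`Flow call failed: ${res.status}`);
  const { result } = await res.json();
  return result; // the grounded answer text
}
```

Wire `askPrepMind` to the chat form's submit handler and append the returned text to the conversation view.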
Architecture Overview
PREPMIND — RAG Chatbot Architecture
[Firebase Hosting] ← Chat UI (HTML + JS)
        ↓ user query
[Firebase Genkit Flow] ← orchestration layer
        ↙             ↘
[Gemini Embeddings]   [Firestore Vector Search]
(query → vector)      (retrieve top 5 chunks)
        ↘             ↙
    [Context Assembler]
            ↓
[Gemini 1.5 Flash] ← generates answer
            ↓
[Streaming Response] → [Chat UI]

[Indexing Pipeline] (run once or on schedule)
[Your Documents] → [Chunker] → [Gemini Embedder] → [Firestore]
Build in 4 Phases
Phase 1 — Firebase Setup (30 min)
Create a Firebase project (free Spark plan). Enable Firestore, Firebase Auth (Google provider), and Firebase Hosting. Install Firebase Genkit.
npm install -g firebase-tools
firebase login
firebase init # select Hosting, Firestore, Functions
npm install @genkit-ai/firebase @genkit-ai/googleai
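One gotcha: the `findNearest` queries in Phase 3 will fail until a vector index exists on the `embedding` field, and that index is created with the gcloud CLI rather than `firebase init`. A sketch of the command, using this project's collection name and the 768-dimension output of text-embedding-004 (double-check the flag syntax against the current Firestore docs):

```shell
# Create the single-field vector index required by findNearest().
# 768 is the output dimension of text-embedding-004.
gcloud firestore indexes composite create \
  --collection-group=knowledge_base \
  --query-scope=COLLECTION \
  --field-config=vector-config='{"dimension":"768","flat": "{}"}',field-path=embedding
```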
Phase 2 — Indexing Pipeline (45 min)
Write a Node.js script that reads documents, chunks them into 512-token pieces with 50-token overlap, embeds each chunk with Gemini's text-embedding-004, and stores each chunk in Firestore with its embedding in a vector field.
// index-documents.js
import { GoogleGenerativeAI } from "@google/generative-ai";
import { initializeApp } from "firebase-admin/app";
import { getFirestore, FieldValue } from "firebase-admin/firestore";

initializeApp(); // picks up application default credentials
const genai = new GoogleGenerativeAI(process.env.GOOGLE_API_KEY);
const db = getFirestore();

async function indexDocument(text, source) {
  const chunks = chunkText(text, 512, 50); // 512 tokens, 50-token overlap
  const embedder = genai.getGenerativeModel({ model: "text-embedding-004" });
  for (const chunk of chunks) {
    const result = await embedder.embedContent(chunk);
    await db.collection("knowledge_base").add({
      text: chunk,
      source: source,
      embedding: FieldValue.vector(result.embedding.values),
      indexed_at: new Date(),
    });
  }
  console.log(`Indexed ${chunks.length} chunks from ${source}`);
}
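The script calls `chunkText`, which is left undefined. A minimal sketch that approximates tokens as whitespace-separated words; a real implementation would use a tokenizer, and the recursive splitting mentioned earlier would try paragraph and sentence boundaries before falling back to words:

```javascript
// chunkText — sliding-window chunker (sketch).
// "Tokens" are approximated as whitespace-separated words here; the
// recursive splitting described in the pipeline (paragraph → sentence
// → word) is simplified to a single word-level pass with overlap.
function chunkText(text, chunkSize = 512, overlap = 50) {
  if (overlap >= chunkSize) throw new Error("overlap must be < chunkSize");
  const words = text.split(/\s+/).filter(Boolean);
  const chunks = [];
  let start = 0;
  while (start < words.length) {
    const end = Math.min(start + chunkSize, words.length);
    chunks.push(words.slice(start, end).join(" "));
    if (end === words.length) break;
    start = end - overlap; // step back so consecutive chunks share context
  }
  return chunks;
}
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk.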
Phase 3 — Genkit RAG Flow (45 min)
Build the Genkit flow that handles query embedding, Firestore vector search, context assembly, and Gemini generation with streaming.
// rag-flow.ts (Firebase Genkit)
import { genkit, z } from "genkit";
import { googleAI, gemini15Flash, textEmbedding004 } from "@genkit-ai/googleai";
import { getFirestore } from "firebase-admin/firestore";

const ai = genkit({ plugins: [googleAI()] });
const db = getFirestore();

export const prepMindFlow = ai.defineFlow(
  { name: "prepMindFlow", inputSchema: z.string(), outputSchema: z.string() },
  async (userQuery) => {
    // 1. Embed the query
    const [{ embedding: queryEmbedding }] = await ai.embed({
      embedder: textEmbedding004,
      content: userQuery,
    });

    // 2. Firestore vector search (requires a vector index on "embedding")
    const snapshot = await db
      .collection("knowledge_base")
      .findNearest("embedding", queryEmbedding, {
        limit: 5,
        distanceMeasure: "COSINE",
      })
      .get();

    // 3. Assemble context
    const context = snapshot.docs
      .map((d) => `[Source: ${d.data().source}]\n${d.data().text}`)
      .join("\n\n---\n\n");

    // 4. Generate a grounded answer
    const { text } = await ai.generate({
      model: gemini15Flash,
      system: `Answer using ONLY the context below. Cite sources.
If the answer is not in the context, say "I don't have information on that."

Context:
${context}`,
      prompt: userQuery,
    });
    return text;
  }
);
Phase 4 — Deploy & Present (30 min)
Deploy to Firebase Hosting. Test with 10 real questions. Add a feedback button (thumbs up/down stored in Firestore) so you can show the evaluation loop in your interview.
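For the feedback button, one lightweight shape for the stored documents; the collection and field names below are assumptions, not fixed by the project:

```javascript
// buildFeedbackDoc — shape of a thumbs-up/down record (sketch).
// Field names (question_id, vote, created_at) are assumptions; use
// whatever matches your Firestore security rules.
function buildFeedbackDoc(questionId, vote) {
  if (vote !== "up" && vote !== "down") {
    throw new Error(`vote must be "up" or "down", got ${vote}`);
  }
  return {
    question_id: questionId,
    vote,
    created_at: new Date().toISOString(),
  };
}
```

On the client you would write it with the modular SDK, e.g. `addDoc(collection(db, "feedback"), buildFeedbackDoc(id, "up"))`, then aggregate votes per question to show your evaluation loop.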
firebase deploy --only hosting,functions
# Your live URL: https://your-project.web.app
What to say in interviews: "I built PrepMind — a RAG chatbot deployed on Firebase that uses Firestore's native vector search and Gemini API. I can walk you through the indexing pipeline, the query flow, and the evaluation loop I added."