Building RAG Apps

Create AI applications powered by your documents and text

RAG Flow Overview

Three-step retrieval-augmented generation

A user question flows through three steps to produce an AI response:

  1. Semantic Search: POST /seeds/query
  2. Generate Context: POST /seeds/generate-context
  3. LLM Generation: OpenAI or Claude

Generate Context

Compile seeds into LLM-ready context

// Assumes baseUrl, headers, and params come from your API client setup
const response = await fetch(`${baseUrl}/seeds/generate-context?${params}`, {
  method: 'POST',
  headers: { ...headers, 'Content-Type': 'application/json' },
  body: JSON.stringify({
    seedIds: ['seed-123', 'seed-456'],
    model: 'gpt-4'
  })
});

const context = await response.json();
// {
//   content: "Document: Q3 Report\n\nThe quarterly...",
//   totalTokens: 2500,
//   seedCount: 2,
//   seeds: [{ id: "...", title: "...", tokens: 1500 }]
// }

Complete RAG with OpenAI

import OpenAI from 'openai';

const openai = new OpenAI();

async function ragQuery(question: string): Promise<string> {
  // 1. Search for relevant chunks
  const searchRes = await fetch(`${baseUrl}/seeds/query?${params}`, {
    method: 'POST',
    headers: { ...headers, 'Content-Type': 'application/json' },
    body: JSON.stringify({ query: question, limit: 5, threshold: 0.7 })
  });
  const { results } = await searchRes.json();

  if (results.length === 0) {
    return "I couldn't find relevant information.";
  }

  // 2. Generate context
  const seedIds = [...new Set(results.map((r: any) => r.seedId))];
  const contextRes = await fetch(`${baseUrl}/seeds/generate-context?${params}`, {
    method: 'POST',
    headers: { ...headers, 'Content-Type': 'application/json' },
    body: JSON.stringify({ seedIds, model: 'gpt-4' })
  });
  const context = await contextRes.json();

  // 3. Generate with OpenAI
  const completion = await openai.chat.completions.create({
    model: 'gpt-4',
    messages: [
      {
        role: 'system',
        content: `Answer based on this context:\n${context.content}`
      },
      { role: 'user', content: question }
    ]
  });

  return completion.choices[0].message.content || '';
}
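
The Claude and chatbot examples below call two helpers, searchSeeds and generateContext. A minimal sketch of those helpers, factoring steps 1 and 2 out of the OpenAI example above (the exact signatures are assumptions):

async function searchSeeds(query: string): Promise<{ seedId: string }[]> {
  const res = await fetch(`${baseUrl}/seeds/query?${params}`, {
    method: 'POST',
    headers: { ...headers, 'Content-Type': 'application/json' },
    body: JSON.stringify({ query, limit: 5, threshold: 0.7 })
  });
  const { results } = await res.json();
  return results;
}

async function generateContext(results: { seedId: string }[]) {
  // Deduplicate seed IDs: several matching chunks can belong to one seed
  const seedIds = [...new Set(results.map((r) => r.seedId))];
  const res = await fetch(`${baseUrl}/seeds/generate-context?${params}`, {
    method: 'POST',
    headers: { ...headers, 'Content-Type': 'application/json' },
    body: JSON.stringify({ seedIds, model: 'gpt-4' })
  });
  return res.json();
}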

Complete RAG with Claude

import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic();

async function ragQuery(question: string): Promise<string> {
  // 1. Search + 2. Generate context (using the helpers sketched above)
  const results = await searchSeeds(question);
  const context = await generateContext(results);

  // 3. Generate with Claude
  const message = await anthropic.messages.create({
    model: 'claude-3-opus-20240229',
    max_tokens: 1024,
    system: `Answer based on this context:\n${context.content}`,
    messages: [{ role: 'user', content: question }]
  });

  // Claude returns a list of content blocks; pick out the text block
  const block = message.content[0];
  return block.type === 'text' ? block.text : '';
}

Chatbot with History

class RAGChatbot {
  private openai = new OpenAI();
  private history: { role: 'user' | 'assistant'; content: string }[] = [];

  async chat(userMessage: string): Promise<string> {
    this.history.push({ role: 'user', content: userMessage });

    // Search and get context
    const results = await searchSeeds(userMessage);
    const context = results.length > 0
      ? await generateContext(results)
      : { content: '' };

    // Generate response with history
    const completion = await this.openai.chat.completions.create({
      model: 'gpt-4',
      messages: [
        {
          role: 'system',
          content: `You have access to user documents.
${context.content ? `Context:\n${context.content}` : ''}`
        },
        ...this.history
      ]
    });

    const response = completion.choices[0].message.content || '';
    this.history.push({ role: 'assistant', content: response });

    // Keep last 20 messages
    if (this.history.length > 20) {
      this.history = this.history.slice(-20);
    }

    return response;
  }
}

const bot = new RAGChatbot();
console.log(await bot.chat('What is in my documents?'));
console.log(await bot.chat('Tell me more about the revenue'));

Best Practices

Token Management

  • Check totalTokens before sending to the LLM (see the sketch after this list)
  • GPT-4 Turbo has a 128K-token context window; Claude 3 has 200K
  • Leave room for response tokens (1000-4000)
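
A minimal budget check before calling the LLM, assuming the context object returned by POST /seeds/generate-context; the window and reserve sizes here are illustrative, so confirm them against your model's documentation:

const CONTEXT_WINDOW = 128_000;   // e.g. GPT-4 Turbo
const RESPONSE_RESERVE = 4_000;   // room for the completion
const budget = CONTEXT_WINDOW - RESPONSE_RESERVE;

if (context.totalTokens > budget) {
  // Keep the smallest seeds until the budget is spent, then regenerate
  let used = 0;
  const trimmedIds = [...context.seeds]
    .sort((a, b) => a.tokens - b.tokens)
    .filter((s) => (used += s.tokens) <= budget)
    .map((s) => s.id);
  // Re-call POST /seeds/generate-context with trimmedIds
}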

Search Optimization

  • Start with a threshold of 0.7 and adjust based on result quality
  • Use 3-5 results for focused answers
  • Filter by bundles for domain-specific queries (see the sketch after this list)
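
A sketch of a bundle-scoped query body; the bundleIds field name is an assumption, so check the API reference for the exact parameter:

const searchRes = await fetch(`${baseUrl}/seeds/query?${params}`, {
  method: 'POST',
  headers: { ...headers, 'Content-Type': 'application/json' },
  body: JSON.stringify({
    query: question,
    limit: 5,
    threshold: 0.7,
    bundleIds: ['bundle-finance']  // assumed field; restricts search to one domain
  })
});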

Security

  • Never expose API keys in frontend code
  • Use environment variables for credentials (see the sketch after this list)
  • Use externalUserId for proper data isolation
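
Credentials belong on the server. A sketch of the setup assumed by the examples above; the SEEDS_API_KEY variable name, the Authorization header shape, and passing externalUserId as a query parameter are assumptions to adapt to your deployment:

// Server-side only: never ship the API key to the browser
const headers = { Authorization: `Bearer ${process.env.SEEDS_API_KEY}` };  // assumed env var name
const params = new URLSearchParams({
  externalUserId: currentUser.id  // hypothetical user object; isolates each user's data
});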