Troubleshooting Runbooks

Troubleshooting runbooks cut debugging time and help resolve production issues faster by providing systematic diagnostic steps and proven solutions for common OneApp package problems.

Why troubleshooting runbooks matter

Without systematic troubleshooting, debugging becomes chaotic:

Wasted debugging time — Spend hours Googling errors, trying random fixes
Production fires — Critical issues hit production, no clear fix path
Repeated mistakes — Same problems solved multiple times by different developers
Knowledge gaps — Only senior developers know how to debug complex issues
Trial and error — Try fixing without understanding root cause
Lost context — Can't remember how similar issue was fixed last time

OneApp's troubleshooting runbooks provide systematic diagnostic guides for seven common problem categories: authentication (session undefined, CORS errors), analytics (events not tracking, feature flags failing), AI chat (timeouts, model errors), database (Prisma client errors, unique constraints), observability (missing logs, Sentry issues), storage (upload failures), and performance (slow loads, Web Vitals). Each runbook lists clear symptoms, diagnosis steps, and proven solutions.

The runbooks are production-ready: they include step-by-step diagnostic procedures, copy-paste solutions, symptom-to-solution mapping, error code explanations, root cause analysis, common pitfalls, and fixes tested against real production issues.

Use cases

Use these runbooks to:

Debug production issues — Follow systematic steps when errors occur in production
Investigate user reports — Match symptoms to known issues and apply fixes
Onboard developers — Give new team members proven debugging procedures
Document solutions — Reference solutions for recurring problems
Prevent repeat issues — Learn root causes to avoid future occurrences
Reduce MTTR — Mean Time To Resolution drops with systematic procedures

Quick Start

1. Identify your problem

Match your symptoms to a runbook category:

// Authentication problems?
→ See Authentication Issues

// Analytics not tracking?
→ See Analytics Issues

// AI chat not responding?
→ See AI & Chat Issues

// Database query errors?
→ See Database Issues

// Errors not appearing in Sentry?
→ See Observability Issues

// File uploads failing?
→ See Storage & Upload Issues

// Page loads slowly?
→ See Performance Issues

That's it! You've identified the right runbook to follow.

2. Follow diagnostic steps

Run each diagnostic step in order:

# Step 1: Check environment
echo $DATABASE_URL

# Step 2: Verify configuration
grep -r "config" app/

# Step 3: Test specific functionality
curl -X POST http://localhost:3000/api/test

That's it! Systematic diagnostics identify the root cause.
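The environment check in step 1 can also be scripted so that every required variable is verified at once. The sketch below is a minimal illustration, not part of any OneApp package: `findMissingEnv` and the `Env` type are hypothetical names, and the variable list is only an example.

```typescript
// Hypothetical helper (not a OneApp API): list required variables that are unset or blank.
type Env = Record<string, string | undefined>;

function findMissingEnv(required: readonly string[], env: Env): string[] {
  return required.filter((name) => {
    const value = env[name];
    // Treat undefined and whitespace-only values as missing
    return value === undefined || value.trim() === "";
  });
}

// Example with an inline env object; in an app you would pass process.env instead.
const exampleEnv: Env = {
  DATABASE_URL: "postgresql://user:password@localhost:5432/db",
  NEXT_PUBLIC_POSTHOG_KEY: ""
};

const missingVars = findMissingEnv(["DATABASE_URL", "NEXT_PUBLIC_POSTHOG_KEY"], exampleEnv);
console.log("Missing environment variables:", missingVars);
```

Running a check like this at startup surfaces a blank variable immediately instead of as a downstream error.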

3. Apply the solution

Copy-paste the solution code and customize:

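// From runbook
"use client";
import { useAuth } from "@repo/auth/client/next";

// AccountGate is a placeholder name; hooks must be called inside a component
export function AccountGate() {
  const { session, isLoading } = useAuth();

  // Customize for your needs
  if (isLoading) return <Spinner />;
  if (!session) return <LoginPrompt />;
  return <div>Welcome {session.user.name}</div>;
}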

That's it! You've resolved the issue systematically.

Complete issue reference

All issues by category

Authentication

Issue	Quick Symptoms
Session is undefined →	Cannot read property 'id' of undefined, login doesn't work
Invalid credentials / CORS →	Login always fails, CORS error in console

Use when: Authentication not working, login failing, session issues

Analytics & Tracking

Issue	Quick Symptoms
Events not tracking →	No events in PostHog, tracking calls not working
Feature flags always false →	useFeatureFlag returns false, features never show

Use when: Analytics not working, events not tracking, feature flags failing

AI & Chat

Issue	Quick Symptoms
Chat not responding →	Request hangs, timeout errors, no AI response
Model not found →	"Model not found" error, chat breaks with specific model

Use when: AI chat not responding, model errors, timeout issues

Database

Issue	Quick Symptoms
Prisma client not initialized →	Prisma client error, database queries fail
Unique constraint failed →	Duplicate key error, constraint violation

Use when: Database errors, Prisma issues, constraint violations

Observability

Issue	Quick Symptoms
Errors not in Sentry →	Errors occur but don't appear in Sentry, no logs

Use when: Errors not tracked, missing logs, Sentry issues

Storage & Files

Issue	Quick Symptoms
File upload fails →	Upload returns success but file not in storage, timeout

Use when: File upload failing, storage issues, timeout on uploads

Performance

Issue	Quick Symptoms
Page loads slowly →	Slow initial load (LCP > 4s), laggy interactions

Use when: Performance issues, slow page loads, high Web Vitals

Authentication Issues

Problem: Session is undefined

Symptoms:

Getting Cannot read property 'id' of undefined errors
Login form submission doesn't redirect
Session appears null even after login

Diagnosis Steps:

# Step 1: Check if AuthProvider is wrapping the app
grep -r "AuthProvider" app/providers.tsx
# Expected: <AuthProvider> wraps your app

# Step 2: Verify you're using correct imports
grep -r "useAuth" app/
# Must be: "use client"; import { useAuth } from "@repo/auth/client/next"
# NOT: import { useAuth } from "@repo/auth/server/next"

# Step 3: Check session retrieval
# In server components: const session = await auth.api.getSession()
# In client components: const { session } = useAuth()

Solution:

// ✅ Correct: app/providers.tsx
"use client";
import { AuthProvider } from "@repo/auth/client/next";

export function Providers({ children }) {
  return <AuthProvider>{children}</AuthProvider>;
}

// ✅ Correct: app/dashboard/page.tsx (Client)
"use client";
import { useAuth } from "@repo/auth/client/next";

export function Dashboard() {
  const { session, isLoading } = useAuth();

  if (isLoading) return <div>Loading...</div>;
  if (!session) return <div>Please log in</div>;

  return <div>Welcome {session.user.name}</div>;
}

// ✅ Correct: app/api/user/route.ts (Server)
import { auth } from "@repo/auth/server/next";

export async function GET() {
  const session = await auth.api.getSession();
  if (!session) return new Response("Unauthorized", { status: 401 });
  return Response.json(session.user);
}

Root cause: Mixing server/client imports or not checking loading state.

Problem: Invalid credentials / CORS

Symptoms:

Login always fails even with correct credentials
CORS error in console
Network shows 401 or 403 response

Diagnosis:

# Step 1: Check browser console for specific error
# Look for CORS, 401, 403, or auth-specific messages

# Step 2: Verify environment variables
echo $NEXT_PUBLIC_AUTH_URL
# Should be set to your app domain

# Step 3: Check password requirements
# Minimum 8 characters by default
# Check @repo/auth configuration

# Step 4: Test with curl
curl -X POST http://localhost:3000/api/auth/signin \
  -H "Content-Type: application/json" \
  -d '{"email":"test@example.com","password":"password123"}'

Solution:

// Ensure password meets requirements (8+ chars)
const password = "longpassword123"; // ✅ Good

// Check API endpoint is correct
const response = await fetch("/api/auth/signin", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ email, password })
});

if (!response.ok) {
  // Log detailed error
  const error = await response.json();
  console.error("Auth error:", error);
}

Root cause: Password requirements not met, an incorrect API endpoint, or NEXT_PUBLIC_AUTH_URL not matching the app's origin (a common source of CORS errors).
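CORS errors appear when the page and the auth endpoint are on different origins, where an origin is the combination of scheme, host, and port. The sketch below compares two URLs by origin using the standard `URL` class. `originsMatch` is a hypothetical helper, not part of `@repo/auth`, and the URLs are placeholders.

```typescript
// Hypothetical helper: true when both URLs share scheme, host, and port.
function originsMatch(authUrl: string, appUrl: string): boolean {
  try {
    return new URL(authUrl).origin === new URL(appUrl).origin;
  } catch {
    // An unparsable URL can never match
    return false;
  }
}

// Same origin: the path does not matter
console.log(originsMatch("http://localhost:3000/api/auth", "http://localhost:3000"));

// Different origin: the subdomain differs, so the browser applies CORS rules
console.log(originsMatch("https://auth.example.com", "https://app.example.com"));
```

If the check fails for your configured auth URL and the address in the browser bar, align the two or configure the server to allow the calling origin.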

Analytics Issues

Problem: Analytics not tracking / Events not showing in PostHog

Symptoms:

No events appearing in PostHog dashboard
console.log shows tracking calls but no data
Analytics seem to work locally but not in production

Diagnosis:

# Step 1: Verify API key
echo $NEXT_PUBLIC_POSTHOG_KEY
# Should output your PostHog API key (starts with phc_)

# Step 2: Check network requests
# Open browser DevTools → Network tab
# Send an event and look for requests to:
#   - app.posthog.com
#   - your domain (if using proxy)
# Should see 200 OK responses

# Step 3: Verify AnalyticsProvider setup
grep -A 20 "AnalyticsProvider" app/providers.tsx

# Step 4: Check event naming
# Event names should not have special characters
# Only alphanumeric, underscores, spaces

Solution:

// ✅ Step 1: Verify provider setup
// app/providers.tsx
"use client";
import { AnalyticsProvider } from "@repo/analytics/client/next";

export function Providers({ children }) {
  return (
    <AnalyticsProvider
      config={{
        providers: {
          posthog: {
            // Ensure API key is from environment
            apiKey: process.env.NEXT_PUBLIC_POSTHOG_KEY!,
            options: {
              api_host: "https://app.posthog.com",
              // Disable auto-capture if not needed
              capture_pageview: false,
              autocapture: false
            }
          }
        },
        debug: true // Enable debugging temporarily
      }}
    >
      {children}
    </AnalyticsProvider>
  );
}

// ✅ Step 2: Verify tracking code
"use client";
import { useAnalytics, track } from "@repo/analytics/client/next";

export function TrackingExample() {
  const analytics = useAnalytics();

  const handleClick = async () => {
    // Use valid event name (alphanumeric + underscores + spaces)
    const event = track("Button Clicked", {
      button_name: "submit", // underscores not spaces in property names
      timestamp: Date.now()
    });
    await analytics.emit(event);
  };

  return <button onClick={handleClick}>Click me</button>;
}

// ✅ Step 3: Check browser console for debug output
// With debug: true, should see:
// "[PostHog] Sending event: Button Clicked"

Root cause: Missing API key, incorrect provider setup, or invalid event naming.
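The naming rule from step 4 (alphanumeric characters, underscores, and spaces only) can be checked before an event is sent. This is a minimal sketch of that rule as stated in this runbook. `isValidEventName` is a hypothetical helper and does not reflect any validation performed by PostHog or `@repo/analytics`.

```typescript
// Rule from the runbook: only letters, digits, underscores, and spaces
const EVENT_NAME_PATTERN = /^[A-Za-z0-9_ ]+$/;

// Hypothetical helper: reject empty names and names with other characters
function isValidEventName(name: string): boolean {
  return name.trim().length > 0 && EVENT_NAME_PATTERN.test(name);
}

for (const name of ["Button Clicked", "checkout_started", "button-clicked!"]) {
  console.log(`${name}: ${isValidEventName(name) ? "ok" : "invalid"}`);
}
```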

Problem: Feature flags always return false

Symptoms:

useFeatureFlag("feature-name") always returns false
Feature never shows even when enabled in PostHog
Can't toggle features

Diagnosis:

# Step 1: Verify feature flag exists in PostHog
# Go to PostHog dashboard → Feature Flags
# Check flag key matches your code

# Step 2: Check PostHog initialization
grep -r "posthog" app/
# Should see AnalyticsProvider or useAnalytics

# Step 3: Verify person/user identification
# Feature flags require user context
# Check that identify() is called before checking flags

Solution:

// ✅ Correct flow: Identify THEN check flag
"use client";
import { useAnalytics } from "@repo/analytics/client/next";
import { useFeatureFlag } from "@repo/feature-flags/client";
import { useAuth } from "@repo/auth/client/next";
import { useEffect } from "react";

export function FeatureComponent() {
  const { session } = useAuth();
  const analytics = useAnalytics();
  const isEnabled = useFeatureFlag("beta-dashboard");

  useEffect(() => {
    // Step 1: Identify user FIRST
    if (session?.user) {
      analytics.identify(session.user.id, {
        email: session.user.email
      });
    }
  }, [session, analytics]);

  // Step 2: Then check flag
  if (!isEnabled) {
    return <OldDashboard />;
  }

  return <NewBetaDashboard />;
}

Root cause: User not identified before checking feature flag.

AI & Chat Issues

Problem: AI chat not responding / Timeouts

Symptoms:

Chat request hangs indefinitely
"Operation timed out" errors
No response from AI model

Diagnosis:

# Step 1: Check API keys
echo $OPENAI_API_KEY
echo $ANTHROPIC_API_KEY
# At least one provider key should be set

# Step 2: Check rate limits
# Look for 429 errors in logs
# Indicates rate limit exceeded

# Step 3: Verify model availability
# Some models may not be available in your region
# Check provider status pages

# Step 4: Check message length
# Very large messages may timeout
# Try with shorter test message

Solution:

// ✅ Correct: Add timeout and better error handling
import { Chat } from "@repo/ai/generation";
import { models } from "@repo/ai";
import { logWarn, logError } from "@repo/shared";
import { getObservability } from "@repo/observability/server/next";

export async function POST(req: Request) {
  try {
    const { messages } = await req.json();

    // Set reasonable timeout
    const controller = new AbortController();
    const timeoutId = setTimeout(() => controller.abort(), 30000); // 30s

    try {
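      const stream = await Chat.stream(messages, {
        model: models.balanced,
        maxTokens: 500,
        // Assumption: Chat.stream forwards abortSignal to the provider call,
        // as the AI SDK's streamText does; without it the timer has no effect
        abortSignal: controller.signal
      });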

      clearTimeout(timeoutId);
      return stream.toUIMessageStreamResponse();
    } catch (error) {
      clearTimeout(timeoutId);
      throw error;
    }
  } catch (error) {
    // Log specific error
    const observability = await getObservability();

    if (error instanceof Error) {
      if (error.message.includes("429")) {
        logWarn("Rate limit exceeded", {
          endpoint: "/api/chat"
        });
        return new Response("Too many requests", { status: 429 });
      }

      if (error.name === "AbortError" || error.message.includes("timeout")) {
        logError(error, {
          context: "chat-timeout"
        });
        observability.captureException(error, { context: "chat-timeout" });
        return new Response("Request timeout", { status: 504 });
      }
    }

    logError(error as Error, {
      context: "ai-chat"
    });
    observability.captureException(error as Error, { context: "ai-chat" });
    return new Response("AI service error", { status: 500 });
  }
}

Root cause: Missing API keys, rate limits, or message too large.
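The error branching in the handler above can be pulled into a small pure function, which makes the mapping from failure to HTTP status easy to test in isolation. The sketch below is one possible shape under that assumption. `classifyChatError` and `ChatFailure` are hypothetical names, and matching on message text is a heuristic rather than a provider guarantee.

```typescript
// Hypothetical result type: the HTTP status and body the route would return
interface ChatFailure {
  status: 429 | 504 | 500;
  body: string;
}

function classifyChatError(error: unknown): ChatFailure {
  if (error instanceof Error) {
    // Rate limit: providers commonly surface the 429 status in the message
    if (error.message.includes("429")) {
      return { status: 429, body: "Too many requests" };
    }
    // Timeout: an aborted request, or a message such as "timeout" / "timed out"
    const aborted = error.name === "AbortError" || error.name === "TimeoutError";
    if (aborted || /timed? ?out/i.test(error.message)) {
      return { status: 504, body: "Request timeout" };
    }
  }
  // Anything else, including non-Error throwables
  return { status: 500, body: "AI service error" };
}

console.log(classifyChatError(new Error("Operation timed out")));
```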

Problem: "Model not found" or "Invalid model"

Symptoms:

Error: "Model claude-3-5-sonnet not found"
Chat breaks when trying to use specific model
Works locally but fails in production

Diagnosis:

# Step 1: Verify model name is correct
# Common formats:
# - OpenAI: gpt-4o, gpt-4-turbo, gpt-4o-mini
# - Anthropic: claude-3-5-sonnet-20241022
# - Google: gemini-2.0-flash-exp

# Step 2: Check provider API key
echo $ANTHROPIC_API_KEY
# If empty, Anthropic models won't work

# Step 3: Use model presets
# models.fast, models.balanced, models.reasoning
# These auto-select available models

Solution:

// ✅ Use model presets (recommended)
import { models } from "@repo/ai";
import { generateText } from "ai";

const result = await generateText({
  model: models.balanced, // Auto-selects based on available keys
  prompt: "Hello"
});

// ✅ Or explicitly verify model availability
import { registry } from "@repo/ai/providers";

const availableModels = registry.getAvailableModels();
console.log("Available:", availableModels);
// Output: ["openai/gpt-4o", "anthropic/claude-3-5-sonnet", ...]

const modelName = "claude-3-5-sonnet-20241022"; // Full model name
const model = registry.languageModel(`anthropic/${modelName}`);

if (!model) {
  console.error(`Model not available: ${modelName}`);
  console.log("Available models:", availableModels);
}

Root cause: Model name incorrect or provider API key not set.
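To illustrate why a model can work locally but fail in production, the sketch below picks the first model whose provider key is present. It is only a guess at how key-based selection might behave, not the actual implementation of the `models` presets. `ModelCandidate`, `MODEL_PREFERENCES`, and `pickAvailableModel` are hypothetical names.

```typescript
// Hypothetical shape: a model ID and the environment variable its provider needs
interface ModelCandidate {
  id: string;
  requiredEnv: string;
}

// Ordered by preference; IDs follow the formats listed in the diagnosis steps
const MODEL_PREFERENCES: readonly ModelCandidate[] = [
  { id: "anthropic/claude-3-5-sonnet-20241022", requiredEnv: "ANTHROPIC_API_KEY" },
  { id: "openai/gpt-4o", requiredEnv: "OPENAI_API_KEY" }
];

function pickAvailableModel(
  candidates: readonly ModelCandidate[],
  env: Record<string, string | undefined>
): string | undefined {
  // First candidate whose provider key is set and non-blank
  const match = candidates.find((c) => (env[c.requiredEnv] ?? "").trim() !== "");
  return match?.id;
}

// Only the OpenAI key is set here, so the Anthropic model is skipped
console.log(pickAvailableModel(MODEL_PREFERENCES, { OPENAI_API_KEY: "sk-test" }));
```

An environment that lacks a key silently narrows the set of usable models, which is why comparing local and production variables is the first diagnostic step.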

Database Issues

Problem: "Prisma client not initialized" / Cannot query database

Symptoms:

Error: "Prisma client not initialized"
Database queries fail
TypeScript errors with db imports

Diagnosis:

# Step 1: Check Prisma is installed
grep "@repo/db-prisma" package.json
pnpm list @repo/db-prisma

# Step 2: Verify schema exists
ls -la packages/oneapp-shared/prisma/schema.prisma

# Step 3: Check client generation
pnpm exec prisma generate --schema=packages/oneapp-shared/prisma/schema.prisma

# Step 4: Verify DATABASE_URL
echo $DATABASE_URL
# Should output your database connection string

Solution:

# ✅ Fix Prisma issues step-by-step
# 1. Clear the generated client cache
rm -rf node_modules/.prisma

# 2. Reinstall dependencies (including @repo/db-prisma)
pnpm install

# 3. Regenerate client (after clearing the cache, so the fresh client is kept)
pnpm exec prisma generate --schema=packages/oneapp-shared/prisma/schema.prisma

# 4. Verify DATABASE_URL is set
export DATABASE_URL="postgresql://user:password@localhost:5432/db"

Root cause: Prisma client not generated or DATABASE_URL not set.
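Printing `DATABASE_URL` confirms that it is set, but not that it is well formed. The sketch below parses the connection string with the standard `URL` class and reports the host, port, and database name without echoing the password. `checkDatabaseUrl` and `DatabaseUrlCheck` are hypothetical names, and the check does not attempt a connection.

```typescript
// Hypothetical result type for the check
type DatabaseUrlCheck =
  | { ok: true; protocol: string; host: string; port: string; database: string }
  | { ok: false; reason: string };

function checkDatabaseUrl(raw: string | undefined): DatabaseUrlCheck {
  if (!raw || raw.trim() === "") {
    return { ok: false, reason: "DATABASE_URL is not set" };
  }

  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return { ok: false, reason: "DATABASE_URL is not a valid URL" };
  }

  // The database name is the path without its leading slash
  const database = url.pathname.replace(/^\//, "");
  if (!url.hostname) return { ok: false, reason: "DATABASE_URL has no host" };
  if (!database) return { ok: false, reason: "DATABASE_URL has no database name" };

  // The password is deliberately left out of the result
  return {
    ok: true,
    protocol: url.protocol.replace(/:$/, ""),
    host: url.hostname,
    port: url.port,
    database
  };
}

console.log(checkDatabaseUrl("postgresql://user:password@localhost:5432/db"));
```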

Problem: "Unique constraint failed" / Duplicate key errors

Symptoms:

Error: "Unique constraint failed on the fields: (email)"
Can't create user because email already exists
Unexpected constraint violations

Diagnosis:

// Step 1: Check what's unique in schema
// Look for @unique or @@unique in schema.prisma

// Step 2: Verify data before insert
const existingUser = await db.user.findUnique({
  where: { email }
});

if (existingUser) {
  console.error("User already exists with this email");
}

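// Step 3: Check error details
// Known request errors are Prisma.PrismaClientKnownRequestError (import { Prisma } from "@prisma/client")
try {
  await db.user.create({ data: { email } });
} catch (error) {
  if (error instanceof Prisma.PrismaClientKnownRequestError && error.code === "P2002") {
    console.error("Unique constraint violated on:", error.meta?.target);
  }
}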

Solution:

// ✅ Handle unique constraints gracefully
import { db } from "@repo/db-prisma";

export async function createOrUpdateUser(email: string) {
  try {
    // Try to update first
    const user = await db.user.update({
      where: { email },
      data: { lastLogin: new Date() }
    });
    return user;
  } catch (error) {
    // If not found, create
    if ((error as { code?: string } | null)?.code === "P2025") {
      return await db.user.create({
        data: { email, name: "New User" }
      });
    }
    throw error;
  }
}

// ✅ Or use upsert
const user = await db.user.upsert({
  where: { email },
  update: { lastLogin: new Date() },
  create: { email, name: "New User" }
});

Root cause: An insert or update supplies a value that already exists in a field with a unique constraint.
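When the Prisma error class is not convenient to import, for example in shared utility code, the error code can be checked structurally instead. The sketch below assumes only that the thrown object carries a `code` string and an optional `meta.target`, as Prisma's known request errors do. `hasPrismaCode` and `uniqueViolationFields` are hypothetical helpers.

```typescript
// Minimal structural view of a Prisma known request error
interface PrismaLikeError {
  code: string;
  meta?: { target?: unknown };
}

// Type guard: true when the value is an object whose code equals the given one
function hasPrismaCode(error: unknown, code: string): error is PrismaLikeError {
  return (
    typeof error === "object" &&
    error !== null &&
    "code" in error &&
    (error as { code: unknown }).code === code
  );
}

// Returns the fields named by a P2002 error, or an empty list for anything else
function uniqueViolationFields(error: unknown): string[] {
  if (!hasPrismaCode(error, "P2002")) return [];
  const target = error.meta?.target;
  if (Array.isArray(target)) return target.map(String);
  return typeof target === "string" ? [target] : [];
}

// Simulated error object for illustration
console.log(uniqueViolationFields({ code: "P2002", meta: { target: ["email"] } }));
```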

Observability Issues

Problem: Errors not appearing in Sentry / No logs

Symptoms:

Errors occur but don't appear in Sentry
No logs in BetterStack
Error logging seems to be silently failing

Diagnosis:

# Step 1: Check Sentry DSN
echo $NEXT_PUBLIC_SENTRY_DSN
# Should be your Sentry project URL

# Step 2: Verify Sentry initialization
grep -rn "Sentry.init" . --include="*.ts" --exclude-dir=node_modules
# Should see initialization in sentry.client.config.ts and sentry.server.config.ts

# Step 3: Check error capture
# Enable Sentry debug mode temporarily in sentry.server.config.ts:
#   Sentry.init({
#     debug: true, // Logs to console
#     // ...
#   });

# Step 4: Verify environment
# Errors may not be sent in development
echo $NODE_ENV

Solution:

// ✅ Proper error logging
import { logError } from "@repo/shared";
import { getObservability } from "@repo/observability/server/next";
import * as Sentry from "@sentry/nextjs";

export async function POST(req: Request) {
  try {
    // Business logic
    await riskyOperation();
  } catch (error) {
    // Method 1: Using shared logger + observability for error tracking
    logError(error as Error, {
      context: "api-endpoint"
    });
    const observability = await getObservability();
    observability.captureException(error as Error, {
      tags: { severity: "high" }
    });

    // Method 2: Direct Sentry capture (also works)
    Sentry.captureException(error);

    // Return safe error response
    return new Response("Internal error", { status: 500 });
  }
}

// ✅ Verify locally
// Run a production build to test error sending, since the dev server runs with NODE_ENV=development
// pnpm build && pnpm start (assuming the standard Next.js scripts)

Root cause: Sentry DSN not set or NODE_ENV=development (errors not sent).
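A DSN that is set but malformed fails just as silently as a missing one. A Sentry DSN is a URL whose username is the public key and whose last path segment is the project ID, so its shape can be checked with the standard `URL` class. `parseSentryDsn` is a hypothetical helper, not part of the Sentry SDK, and the DSN below is a placeholder.

```typescript
// Parts of a DSN that are safe to log
interface SentryDsnParts {
  publicKey: string;
  host: string;
  projectId: string;
}

function parseSentryDsn(dsn: string | undefined): SentryDsnParts | null {
  if (!dsn) return null;

  let url: URL;
  try {
    url = new URL(dsn);
  } catch {
    return null;
  }

  // The project ID is the last non-empty path segment
  const projectId = url.pathname.split("/").filter(Boolean).pop() ?? "";
  if (!url.username || !projectId) return null;

  return { publicKey: url.username, host: url.host, projectId };
}

// Placeholder DSN for illustration
console.log(parseSentryDsn("https://examplePublicKey@o0.ingest.sentry.io/42"));
```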

Storage & Upload Issues

Problem: File upload fails silently / Files not appearing

Symptoms:

Upload returns success but file not in storage
S3/Blob upload timeout
Large files fail but small ones work

Diagnosis:

# Step 1: Check storage credentials
echo $NEXT_PUBLIC_STORAGE_TYPE
# Should be "blob", "r2", or "images"

# Step 2: Verify bucket exists
# Depends on provider (Vercel Blob, R2, etc.)

# Step 3: Check file size limits
# Most providers have max file size limits
# Vercel Blob: 500MB max
# R2: 5TB max

# Step 4: Check network
# Large uploads may fail due to timeout
# Verify internet connection is stable

Solution:

// ✅ Proper file upload with error handling
import { uploadFile } from "@repo/storage/client/next";
import { logError } from "@repo/shared";
import { getObservability } from "@repo/observability/client/next";

export async function handleFileUpload(file: File) {
  try {
    // Validate file
    const maxSize = 10 * 1024 * 1024; // 10MB
    if (file.size > maxSize) {
      throw new Error(`File too large. Max ${maxSize / 1024 / 1024}MB`);
    }

    // Upload with error handling
    const result = await uploadFile({
      file,
      bucket: "uploads"
    });

    if (!result) {
      throw new Error("Upload failed - no result returned");
    }

    return result;
  } catch (error) {
    // Log error for debugging
    logError(error as Error, {
      context: "file-upload",
      fileName: file.name,
      fileSize: file.size
    });
    const observability = await getObservability();
    observability.captureException(error as Error, {
      context: "file-upload",
      fileName: file.name,
      fileSize: file.size
    });

    // Show user-friendly message
    throw new Error("Upload failed. Please try again.");
  }
}

Root cause: File too large, invalid bucket, or network timeout.

Performance Issues

Problem: Page loads slowly / High Core Web Vitals

Symptoms:

Slow initial page load (LCP > 4s)
Laggy interactions (INP > 200ms)
Layout shifts (CLS > 0.1)

Diagnosis:

# Step 1: Check Web Vitals
# Use PageSpeed Insights: https://pagespeed.web.dev/

# Step 2: Enable performance monitoring
# Check observability configuration in your app
# Should send Web Vitals to Sentry/PostHog

# Step 3: Profile locally
# Use Chrome DevTools → Performance tab
# Record page load and identify slow tasks

# Step 4: Check bundle size
pnpm build
# Check .next/static/chunks/ sizes

Solution:

// ✅ Optimize performance
// 1. Use dynamic imports for heavy components
import dynamic from "next/dynamic";
const HeavyComponent = dynamic(() => import("./HeavyComponent"), {
  loading: () => <div>Loading...</div>,
});

// 2. Implement proper caching
export const revalidate = 3600; // Cache for 1 hour

// 3. Use React.memo for expensive renders
const MemoizedComponent = React.memo(MyComponent);

// 4. Monitor Web Vitals
import { useReportWebVitals } from "next/web-vitals";

export function WebVitalsReporter() {
  useReportWebVitals((metric) => {
    console.log(`${metric.name}: ${metric.value}`); // ms for timing metrics; CLS is unitless
    // Send to analytics
  });
  return null;
}

Root cause: Large bundles, unoptimized images, or expensive client-side rendering.
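Raw metric values are easier to act on once they are bucketed. The sketch below rates LCP, INP, and CLS against the published Core Web Vitals thresholds (good up to 2.5 s, 200 ms, and 0.1; poor above 4 s, 500 ms, and 0.25). `rateWebVital` is a hypothetical helper that could be called from a reporting callback.

```typescript
type VitalName = "LCP" | "INP" | "CLS";
type Rating = "good" | "needs-improvement" | "poor";

// [upper bound for "good", upper bound for "needs-improvement"]
// LCP and INP are in milliseconds; CLS is a unitless score
const VITAL_THRESHOLDS: Record<VitalName, [number, number]> = {
  LCP: [2500, 4000],
  INP: [200, 500],
  CLS: [0.1, 0.25]
};

function rateWebVital(name: VitalName, value: number): Rating {
  const [good, needsImprovement] = VITAL_THRESHOLDS[name];
  if (value <= good) return "good";
  return value <= needsImprovement ? "needs-improvement" : "poor";
}

console.log(rateWebVital("LCP", 4200));
console.log(rateWebVital("CLS", 0.05));
```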

Quick Troubleshooting Checklist

Before debugging

Check browser console for errors
Check server logs for errors
Hard-refresh to bypass the browser cache (Cmd+Shift+R / Ctrl+F5)
Restart dev server (pnpm dev)
Try incognito/private browsing
Check environment variables are set

Systematic debugging process

Isolate — Can you reproduce consistently?
Search — Search error message in docs/GitHub issues
Trace — Follow the error stack trace
Log — Add console.log to track execution
Compare — Compare with working example
Ask — Search documentation or ask community

Next steps

Learn error patterns: Error Handling Guide →
See architecture: Architecture Diagrams →
Get quick examples: Quick Reference →

For Developers: Advanced troubleshooting techniques

Advanced Debugging Techniques

Enable verbose logging

Environment-wide debugging:

# All debug logs
DEBUG=* pnpm dev

# Package-specific logs
DEBUG=@repo/* pnpm dev

# Specific log level
LOG_LEVEL=debug pnpm dev

# Multiple debug namespaces
DEBUG=@repo/auth,@repo/analytics pnpm dev

Application-level debugging:

// Enable Prisma query logging
const prisma = new PrismaClient({
  log: ["query", "info", "warn", "error"]
});

// Enable Sentry debug mode
Sentry.init({
  debug: true,
  tracesSampleRate: 1.0
});

// Enable PostHog debug mode
posthog.init(process.env.NEXT_PUBLIC_POSTHOG_KEY!, {
  debug: true
});

Trace AI operations

import { traceAIOperation } from "@repo/observability/server/next";

const result = await traceAIOperation(
  {
    name: "chat-completion",
    model: "gpt-4",
    input: messages
  },
  async (span) => {
    const response = await generateText({
      /* ... */
    });

    span.setAttributes({
      "ai.tokens.input": response.usage.promptTokens,
      "ai.tokens.output": response.usage.completionTokens,
      "ai.cost.usd": calculateCost(response.usage)
    });

    console.log("Span ID:", span.spanContext().spanId);
    console.log("Trace ID:", span.spanContext().traceId);

    return response;
  }
);

Monitor analytics events

"use client";
import { useAnalytics } from "@repo/analytics/client/next";
import { useEffect } from "react";

export function AnalyticsDebugger() {
  const analytics = useAnalytics();

  useEffect(() => {
    // Log all events
    analytics.on("track", (event) => {
      console.log("📊 Tracked:", event.name, event.properties);
    });

    // Log identify calls
    analytics.on("identify", (userId, traits) => {
      console.log("👤 Identified:", userId, traits);
    });

    // Log page views
    analytics.on("page", (category, name) => {
      console.log("📄 Page:", category, name);
    });
  }, [analytics]);

  return null;
}

Network request debugging

# Proxy all requests through debugging tool
# Install mitmproxy: brew install mitmproxy

# Start proxy
mitmproxy -p 8080

# Configure browser to use proxy
# Then visit http://mitm.it to install cert

# Now all requests visible in mitmproxy interface

Performance profiling

// React DevTools Profiler
import { Profiler } from "react";

function onRenderCallback(
  id: string,
  phase: "mount" | "update",
  actualDuration: number,
  baseDuration: number,
  startTime: number,
  commitTime: number
) {
  console.log(`${id} (${phase}) took ${actualDuration}ms`);
}

export function ProfiledComponent() {
  return (
    <Profiler id="MyComponent" onRender={onRenderCallback}>
      <MyComponent />
    </Profiler>
  );
}

Common Error Codes Reference

Prisma Error Codes

Code	Meaning	Solution
P2002	Unique constraint failed	Check for existing record before insert
P2025	Record not found	Verify ID exists before update/delete
P2003	Foreign key constraint failed	Ensure related record exists
P1001	Can't reach database	Check DATABASE_URL and connection

HTTP Status Codes

Code	Meaning	Common Cause
401	Unauthorized	Missing or invalid session
403	Forbidden	User doesn't have permission
404	Not Found	Resource doesn't exist
429	Too Many Requests	Rate limit exceeded
500	Internal Server Error	Unhandled exception in API

AI Provider Error Codes

Provider	Code	Meaning
OpenAI	429	Rate limit exceeded
OpenAI	401	Invalid API key
Anthropic	overloaded_error	Service temporarily unavailable
Anthropic	invalid_request_error	Malformed request

Creating Custom Runbooks

When to create a runbook

Create runbooks for:

Recurring issues — Problem happens multiple times across team
Complex diagnosis — Multi-step process to identify root cause
Production incidents — Issue that affected users
Onboarding gaps — New developers frequently stuck on same issue

Runbook template

## Problem: [Short problem description]

**Symptoms:**

- Bullet list of observable symptoms
- What user/developer sees when problem occurs
- Error messages, broken functionality

**Diagnosis:**

```bash
# Step 1: Check [component]
[command to run]
# Expected: [what you should see]

# Step 2: Verify [configuration]
[command to run]
# Expected: [what you should see]

# Step 3: Test [functionality]
[command to run]
# Expected: [what you should see]
```

**Solution:**

```tsx
// ✅ Correct implementation
[working code example]
```

**Root cause**: [Brief explanation of why problem occurs]

Contributing runbooks

Process:

Encounter problem
Document symptoms
Record diagnosis steps
Capture working solution
Explain root cause
Submit PR with runbook

# Add your runbook to this file:
# platform/apps/docs/content/400-guides/100-troubleshooting-runbooks.mdx

# Commit
git add platform/apps/docs/content/400-guides/100-troubleshooting-runbooks.mdx
git commit -m "docs: add runbook for [problem]"

# Create PR
gh pr create --title "docs: add troubleshooting runbook for [problem]"

Error Handling Guide → — AsyncResult pattern and error boundaries
Quick Reference → — Common fixes and error solutions
Architecture Diagrams → — System flows for tracing issues
@repo/observability Documentation → — Error tracking setup
