Kwasi — Personal Assistant

Kwasi is a personal AI assistant available via Telegram (primary), WhatsApp, and CLI. It handles scheduling, email, notes, tasks, reminders, web search, weather, GitHub, browser automation, and more — all through natural conversation.

This documentation covers how the system actually works: its architecture, data flows, memory model, tools, and deployment setup.

Module layout changed (2026-06 cleanup)

A behavior-preserving decomposition split the three former "god files" into smaller modules. Where prose below says a tool, loop, or handler is "in agent.py / main.py / bot.py", the implementation now lives in:

  • Agent tools → app/agent_tools/<domain>.py (one module per domain; agent.py is now thin and just wires and re-exports them). Tool implementations are still in app/tools/.
  • Background loops → app/background.py (spawned by main.py's lifespan).
  • HTTP routes (/message, /reflect, /embed-backfill) → app/routers/api.py.
  • Telegram → bot.py keeps the dispatch core; helpers.py / media.py / commands.py hold the shared utils, media handlers, and slash commands.
  • Prompt text → app/prompt_text.py; AgentDeps/build_deps → app/deps.py.

The original modules re-export their public symbols, so the data flows and behavior described here are unchanged. See CLAUDE.md → Recent Changes for the full map.


At a Glance

| Concern | Choice |
| --- | --- |
| Framework | Pydantic AI: agent loop, tool registration, streaming |
| LLM | Three model roles: MODEL_NAME (primary), MINI_MODEL_NAME (skill synthesis), REFLECTION_MODEL_NAME (nightly reflection); supports Gemini, Claude, GPT-4 |
| Primary interface | Telegram (long-polling via python-telegram-bot) |
| Secondary interface | WhatsApp webhook |
| Web framework | FastAPI: health check, reflection endpoint, WhatsApp webhook, dashboard |
| Storage | SQLite (local dev) / PostgreSQL (Railway production) |
| Email & calendar | Gmail (personal + work) + Outlook via MCP tool wrappers |
| Multimodal | Gemini Vision/STT: images, PDFs, voice messages |
| Voice output | Microsoft Neural TTS via edge-tts: voice-in → voice-out |
| Browser | Playwright (headless Chromium): JS-rendered pages + form submission |
| GitHub | PyGitHub: PRs, issues, notifications, repo summaries |
| Jira | Jira Cloud REST API: issues, projects, JQL search, create/update tickets |
| Google Drive | Drive API: search, read, list recent, meeting transcripts (personal + work) |
| Slack | Slack SDK: read channels, unreads, post messages, search history |
| Task delegation | Three E2B-sandbox backends: delegate_to_coding_agent (OpenCode default / Claude Code) + a dedicated delegate_web_task (browser-use + Gemini, for DOM-driven scraping & multi-step form-fills); approval-gated, daily/wall-clock capped, async result delivery |
| Credential vault | Fernet-encrypted website logins (app/vault.py, VAULT_MASTER_KEY); delegate_web_task(credential=...) logs into real accounts without the LLM ever seeing the plaintext |
| Meeting follow-up | Polls work Gmail for Gemini meeting notes, extracts Lawrence-owned action items, and sends a Telegram card to save as intentions/note |
| Intent routing | Keyword classifier → 13-domain agent dispatch (email, calendar, memory, github, news, slack, jira, drive, meetings, diagnostics, health, database, utility); context-aware follow-up inheritance; semantic fallback with three confidence bands (≥0.65 single domain; 0.45–0.65 top-K composed agent; <0.45 full agent) |
| Self-improvement loop | read_source_file / grep_source for code introspection; post-turn capability_gaps detection (regex prefilter → mini-model → grep-verify); propose_skill + static validator + approval-gated activation that hot-reloads onto every skill-bearing agent |
| Self-management | Kwasi changes its own config and deployment from Telegram: set_runtime_config (instant, no restart), railway_set_env / railway_redeploy (approval-gated), diagnose_self health snapshot. Allowlist derived from a single CONFIG_REGISTRY; secrets blocked; proactive ✅/⚠️ after a self-redeploy. Requires the RAILWAY_* vars. |
| Observability | Logfire (infra), Langfuse (LLM traces + managed prompts + scores), and optional W&B Weave (WEAVE_ENABLED), all on one shared OpenTelemetry provider |
| Evaluation | eval/ on pydantic-evals: free CI-gated routing eval, real-agent tool-selection sweep with an LLM judge, results in both Logfire & Langfuse Experiments, and a production-trace mining loop that feeds new cases back into the corpus. See Evaluation. |
| Plugin system | File-drop skill registry: drop a .py into app/skills/. Built-ins: Content Curator, Travel Briefing, CV, Deep Research, Meeting Intelligence |
| Personality | Explicit directives in system prompt: direct voice, Ghanaian cultural context, banned apology openers, retry-first error handling, situational tone calibration |
| Semantic search | Gemini gemini-embedding-001 (3072 dims): vector embeddings stored in SQLite (JSON) / PostgreSQL (pgvector) for meaning-based search across notes, history, and saved articles |
| Wearable / health data (Spec 010) | Sideloaded Kotlin Android bridge (bridge-android/) reads Google Health Connect every ~15 min and POSTs to POST /health/ingest. Stored as a single normalised health_samples table; queried by a dedicated health_agent |
| Deployment | Railway (Dockerfile) |
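
The three confidence bands listed under Intent routing above can be read as a simple threshold ladder. The sketch below is illustrative only: the thresholds come from the table, but `choose_route`, the `top_k` default, and the shape of the score dictionary are assumptions, not Kwasi's actual routing code.

```python
from typing import Dict, List, Tuple

SINGLE_DOMAIN_MIN = 0.65  # from the table: >= 0.65 routes to a single domain agent
COMPOSED_MIN = 0.45       # from the table: 0.45 up to 0.65 composes a top-K agent


def choose_route(scores: Dict[str, float], top_k: int = 3) -> Tuple[str, List[str]]:
    """Map per-domain similarity scores to a routing decision (hypothetical helper)."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked:
        return ("full", [])
    best_domain, best_score = ranked[0]
    if best_score >= SINGLE_DOMAIN_MIN:
        return ("single", [best_domain])
    if best_score >= COMPOSED_MIN:
        # Assumption: the composed agent takes the K highest-scoring domains.
        return ("composed", [domain for domain, _ in ranked[:top_k]])
    # Below the lowest band, fall back to the full agent with every tool.
    return ("full", [])
```

A score of exactly 0.65 is treated here as single-domain, matching the "≥0.65" wording in the table.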

Key Design Principles

Single AgentDeps instance. All interfaces (Telegram, WhatsApp, CLI) use the same build_deps() factory and share one AgentDeps object per process. Each request gets a shallow per-request copy (via dataclasses.replace) with the correct telegram_chat_id and active_categories — the shared singleton is never mutated.
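
A minimal sketch of the per-request copy described above. The field list, the `frozen=True` choice, and the `deps_for_request` helper are assumptions for illustration; only `AgentDeps`, `build_deps`, `dataclasses.replace`, `telegram_chat_id`, and `active_categories` are named in this page.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)  # assumption: freezing is one way to guarantee no mutation
class AgentDeps:
    storage: object  # placeholder for the shared storage adapter
    telegram_chat_id: Optional[int] = None
    active_categories: Tuple[str, ...] = ()


_SHARED: Optional[AgentDeps] = None


def build_deps() -> AgentDeps:
    """Return the one shared AgentDeps for this process, creating it on first use."""
    global _SHARED
    if _SHARED is None:
        _SHARED = AgentDeps(storage=object())
    return _SHARED


def deps_for_request(chat_id: int, categories: Tuple[str, ...]) -> AgentDeps:
    """Hypothetical helper: shallow copy with request-specific fields overridden."""
    return replace(build_deps(), telegram_chat_id=chat_id, active_categories=categories)
```

Because `replace` makes a shallow copy, heavy members such as the storage adapter are shared by reference while the request-scoped fields differ per call.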

Fail-safe everything. Storage errors never surface to the user. Tool failures are reported gracefully. The system prompt context fetch is wrapped in try/except so a broken DB never kills the agent.
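
The guarded context fetch might look roughly like the following. This is a synchronous sketch under stated assumptions: `safe_context`, the logger name, and the empty-string fallback are hypothetical, and the real fetch is likely asynchronous.

```python
import logging
from typing import Callable

logger = logging.getLogger("kwasi.sketch")  # hypothetical logger name


def safe_context(fetch: Callable[[], str], fallback: str = "") -> str:
    """Run a context fetch, returning a fallback instead of raising."""
    try:
        return fetch()
    except Exception:
        # Record the failure for operators; the user-facing turn carries on.
        logger.exception("context fetch failed; continuing without it")
        return fallback


def broken_fetch() -> str:
    """Stand-in for a storage call against an unavailable database."""
    raise RuntimeError("database unavailable")
```

With this shape, `safe_context(broken_fetch)` yields an empty context and the agent still runs, only with less background in its prompt.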

Storage is pluggable. A StoragePort protocol defines the interface. SqliteAdapter and PostgresAdapter are drop-in implementations. Switching is a single env var change (STORAGE_BACKEND=postgres). Both adapters also implement vector similarity search — Python cosine in SQLite, pgvector HNSW in Postgres.
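
To make the protocol-plus-adapter idea concrete, here is a small sketch of a storage protocol with a SQLite-backed adapter that stores embeddings as JSON and ranks them with a pure-Python cosine. The method names (`add_note`, `search_similar`), the table schema, and the `SqliteSketchAdapter` class are assumptions; the real `StoragePort` interface is not shown on this page.

```python
import json
import math
import sqlite3
from typing import List, Protocol, Sequence, Tuple


class StoragePort(Protocol):
    """Hypothetical subset of the storage interface."""

    def add_note(self, text: str, embedding: Sequence[float]) -> None: ...

    def search_similar(self, query: Sequence[float], limit: int) -> List[Tuple[str, float]]: ...


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class SqliteSketchAdapter:
    """Embeddings are stored as JSON text and scored in Python."""

    def __init__(self, path: str = ":memory:") -> None:
        self._db = sqlite3.connect(path)
        self._db.execute("CREATE TABLE notes (text TEXT, embedding TEXT)")

    def add_note(self, text: str, embedding: Sequence[float]) -> None:
        self._db.execute("INSERT INTO notes VALUES (?, ?)", (text, json.dumps(list(embedding))))

    def search_similar(self, query: Sequence[float], limit: int) -> List[Tuple[str, float]]:
        rows = self._db.execute("SELECT text, embedding FROM notes").fetchall()
        scored = [(text, cosine(query, json.loads(raw))) for text, raw in rows]
        scored.sort(key=lambda row: row[1], reverse=True)
        return scored[:limit]
```

A Postgres adapter satisfying the same protocol would push the ranking into the database instead of scanning rows in Python, which is the trade-off the paragraph above describes.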

Two-tier memory pipeline. Facts stated in conversation are extracted within seconds (post-message mini-model pass). A nightly Reflection Engine then distils the full history into a narrative profile, intentions, behavioral learnings, and topic-cluster Summary: <topic> notes. Both tiers feed the same user_facts and notes tables — the agent always has the freshest possible context. Before each agent run, up to three semantic retrieval layers (notes, summaries, saved articles) inject relevant context into the user turn, sharing a 1,000-token budget with XML tagging so the model can distinguish retrieved memory from instructions.
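
The shared retrieval budget and XML tagging can be sketched as follows. The 1,000-token figure comes from the paragraph above; the tag names, the four-characters-per-token estimate, and both helper functions are assumptions made for illustration.

```python
from typing import Dict, List
from xml.sax.saxutils import escape

TOKEN_BUDGET = 1000  # from the text: one budget shared by all retrieval layers


def estimate_tokens(text: str) -> int:
    """Rough assumption: about four characters per token."""
    return max(1, len(text) // 4)


def build_memory_block(layers: Dict[str, List[str]], budget: int = TOKEN_BUDGET) -> str:
    """Wrap each layer's snippets in its own tag until the shared budget runs out."""
    parts: List[str] = []
    remaining = budget
    for tag, snippets in layers.items():
        kept: List[str] = []
        for snippet in snippets:
            cost = estimate_tokens(snippet)
            if cost > remaining:
                break  # this layer is out of room; later layers may still fit
            kept.append(escape(snippet))  # keep retrieved text from closing the tag
            remaining -= cost
        if kept:
            body = "\n".join(kept)
            parts.append(f"<{tag}>\n{body}\n</{tag}>")
    return "\n".join(parts)


def compose_user_turn(message: str, layers: Dict[str, List[str]]) -> str:
    """Prepend tagged memory to the user's message when any was retrieved."""
    memory = build_memory_block(layers)
    return f"{memory}\n\n{message}" if memory else message
```

Escaping the snippets matters because retrieved notes are untrusted text: without it, a saved article containing a closing tag could blur the line between memory and instructions.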

Background loops run inside FastAPI lifespan. The reminder loop, daily briefing, scheduled tasks loop, alert loop, and nightly reflection are asyncio tasks started on startup and cancelled on shutdown — no separate workers or queues.
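
The start-on-startup, cancel-on-shutdown pattern can be sketched with plain asyncio. This example avoids importing FastAPI so it stays self-contained; in a FastAPI app the context manager would instead take the app as its argument and be passed as `FastAPI(lifespan=...)`. The loop body, interval, and `demo` function are hypothetical.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncIterator, Awaitable, Callable, List, Tuple


async def reminder_loop(ticks: List[str], interval: float = 0.01) -> None:
    """Stand-in for a background loop: do one unit of work, then sleep."""
    while True:
        ticks.append("reminder")
        await asyncio.sleep(interval)


@asynccontextmanager
async def lifespan(loops: List[Callable[[], Awaitable[None]]]) -> AsyncIterator[List[asyncio.Task]]:
    # Startup: spawn every loop as a task on the running event loop.
    tasks = [asyncio.create_task(loop()) for loop in loops]
    try:
        yield tasks
    finally:
        # Shutdown: cancel each task and wait for the cancellations to settle.
        for task in tasks:
            task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)


async def demo() -> Tuple[List[str], List[asyncio.Task]]:
    ticks: List[str] = []
    async with lifespan([lambda: reminder_loop(ticks)]) as tasks:
        await asyncio.sleep(0.05)  # the "server" runs while the loops tick
    return ticks, tasks
```

Running `asyncio.run(demo())` leaves every task cancelled once the context exits, which is the behavior that removes the need for separate workers or queues.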

Extensible by file-drop. Adding a new skill requires only creating a file in app/skills/ — no changes to core agent code.
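
One way a file-drop registry can work is to scan a directory and import each module it finds. The sketch below is an assumption about the general technique, not the actual loader: the `run` entry-point convention, the `load_skills` name, and the skip-on-error policy are all hypothetical.

```python
import importlib.util
from pathlib import Path
from typing import Callable, Dict


def load_skills(skills_dir: Path) -> Dict[str, Callable]:
    """Import every public .py file in skills_dir and collect its `run` callable."""
    registry: Dict[str, Callable] = {}
    for path in sorted(skills_dir.glob("*.py")):
        if path.name.startswith("_"):
            continue  # skip __init__.py and private helpers
        spec = importlib.util.spec_from_file_location(f"sketch_skill_{path.stem}", path)
        if spec is None or spec.loader is None:
            continue
        module = importlib.util.module_from_spec(spec)
        try:
            spec.loader.exec_module(module)
        except Exception:
            continue  # a broken skill file should not take the others down
        run = getattr(module, "run", None)  # hypothetical entry-point convention
        if callable(run):
            registry[path.stem] = run
    return registry
```

Under this convention, dropping a `greet.py` that defines `run(name)` into the directory makes `load_skills(...)["greet"]` available with no change to the loader itself.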

Intent-routed for efficiency. Every message (Telegram and WhatsApp) is classified by a keyword matcher in microseconds. The message is then handled by a focused domain agent carrying only its relevant tools — reducing token usage per call and improving model accuracy. The full agent is always the fallback for ambiguous or cross-domain requests.
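
A keyword matcher of this kind can be as small as a set intersection per domain. The keyword lists, the three-domain subset, and the `classify` / `route` helpers below are invented for illustration, and the sketch omits the semantic fallback that sits between the keyword pass and the full agent.

```python
import re
from typing import Dict, Optional, Set

# Hypothetical keyword lists for three of the domains named on this page.
DOMAIN_KEYWORDS: Dict[str, Set[str]] = {
    "email": {"email", "inbox", "gmail", "outlook"},
    "calendar": {"calendar", "schedule", "appointment"},
    "github": {"github", "pr", "issue"},
}


def classify(message: str) -> Optional[str]:
    """Return a domain only when exactly one domain's keywords match."""
    words = set(re.findall(r"[a-z0-9]+", message.lower()))
    hits = [domain for domain, keywords in DOMAIN_KEYWORDS.items() if words & keywords]
    return hits[0] if len(hits) == 1 else None


def route(message: str) -> str:
    """Pick a focused agent, or the full agent for ambiguous or cross-domain text."""
    domain = classify(message)
    return f"{domain}_agent" if domain else "full_agent"
```

Treating multi-domain matches as ambiguous keeps the focused agents narrow: a request that touches both email and GitHub goes to the full agent rather than to an agent missing half the tools it needs.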


Quick Navigation

  • Curriculum — Pedagogic walkthrough of the whole codebase in 8 sessions (start here if you want to understand the system)
  • Architecture — Component diagram and how the pieces fit together
  • Message Lifecycle — Sequence diagrams for text, voice, image, and document messages
  • Memory & Reflection — How short-term history and long-term context work
  • Tools Reference — Every tool available to the agent (native, skills, MCP, GitHub, browser)
  • Background Loops — Reminders, briefing, scheduled tasks, nightly reflection
  • Storage Layer — Schema, adapters, and the storage protocol
  • Evaluation — How the agent is evaluated: routing, tool-selection, LLM judge, and the production-trace feedback loop
  • Deployment — Railway setup and full environment variable reference