
Kwasi — Personal Assistant

Kwasi is a personal AI assistant available via Telegram (primary), WhatsApp, and CLI. It handles scheduling, email, notes, tasks, reminders, web search, weather, GitHub, browser automation, and more — all through natural conversation.

This documentation covers how the system actually works: its architecture, data flows, memory model, tools, and deployment setup.


At a Glance

Concern | Choice
Framework | Pydantic AI: agent loop, tool registration, streaming
LLM | Three model roles: MODEL_NAME (primary), MINI_MODEL_NAME (skill synthesis), REFLECTION_MODEL_NAME (nightly reflection); supports Gemini, Claude, GPT-4
Primary interface | Telegram (long-polling via python-telegram-bot)
Secondary interface | WhatsApp webhook
Web framework | FastAPI: health check, reflection endpoint, WhatsApp webhook, dashboard
Storage | SQLite (local dev) / PostgreSQL (Railway production)
Email & calendar | Gmail (personal + work) + Outlook via MCP tool wrappers
Multimodal | Gemini Vision/STT: images, PDFs, voice messages
Voice output | Microsoft Neural TTS via edge-tts: voice-in → voice-out
Browser | Playwright (headless Chromium): JS-rendered pages + form submission
GitHub | PyGithub: PRs, issues, notifications, repo summaries
Jira | Jira Cloud REST API: issues, projects, JQL search, create/update tickets
Google Drive | Drive API: search, read, list recent, meeting transcripts (personal + work)
Slack | Slack SDK: read channels, unreads, post messages, search history
Intent routing | Keyword classifier → 12-domain agent dispatch (email, calendar, memory, github, news, slack, jira, drive, meetings, diagnostics, health, utility); context-aware follow-up inheritance; semantic fallback
Plugin system | File-drop skill registry: drop a .py file into app/skills/. Built-ins: Content Curator, Travel Briefing, CV, Deep Research, Meeting Intelligence
Personality | Explicit directives in system prompt: direct voice, Ghanaian cultural context, banned apology openers, retry-first error handling, situational tone calibration
Semantic search | Gemini gemini-embedding-001 (3072 dims): vector embeddings stored in SQLite (JSON) / PostgreSQL (pgvector) for meaning-based search across notes, history, and saved articles
Wearable / health data (Spec 010) | Sideloaded Kotlin Android bridge (bridge-android/) reads Google Health Connect every ~15 min and sends the samples to POST /health/ingest. Samples are stored in a single normalised health_samples table and queried by a dedicated health_agent
Deployment | Railway (Dockerfile)

Key Design Principles

Single AgentDeps instance. All interfaces (Telegram, WhatsApp, CLI) use the same build_deps() factory and share one AgentDeps object per process. Each request gets a shallow per-request copy (via dataclasses.replace) with the correct telegram_chat_id and active_categories — the shared singleton is never mutated.
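
The per-request copy described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the fields on AgentDeps, the frozen=True setting, and the deps_for_request helper are all assumptions made for the example; only build_deps, dataclasses.replace, telegram_chat_id, and active_categories come from the description.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)  # frozen is an assumption; it makes accidental mutation an error
class AgentDeps:
    # Hypothetical fields; the real AgentDeps likely carries many more.
    storage: object
    telegram_chat_id: Optional[int] = None
    active_categories: tuple = ()


_SHARED: Optional[AgentDeps] = None


def build_deps() -> AgentDeps:
    """Return the process-wide AgentDeps, creating it on first use."""
    global _SHARED
    if _SHARED is None:
        _SHARED = AgentDeps(storage={"backend": "sqlite"})  # placeholder storage
    return _SHARED


def deps_for_request(chat_id: int, categories: tuple) -> AgentDeps:
    """Hypothetical helper: shallow copy with request-specific fields set."""
    return replace(
        build_deps(),
        telegram_chat_id=chat_id,
        active_categories=categories,
    )
```

Because the copy is shallow, heavyweight members such as the storage adapter are shared by reference, while the request-specific fields live only on the copy.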

Fail-safe everything. Storage errors never surface to the user. Tool failures are reported gracefully. The system prompt context fetch is wrapped in try/except so a broken DB never kills the agent.
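
One way to picture the guarded context fetch is the sketch below. The function name build_context_block, the get_user_facts method, and the logger name are hypothetical; the only thing taken from the description is that the fetch sits inside try/except and that a broken database degrades to "no context" rather than an error.

```python
import logging

logger = logging.getLogger("kwasi.context")  # hypothetical logger name


class BrokenStorage:
    """Stand-in for a storage adapter whose database is unavailable."""

    def get_user_facts(self, chat_id: int) -> list:
        raise ConnectionError("database unreachable")


def build_context_block(storage, chat_id: int) -> str:
    """Return extra system-prompt context, or an empty string on any failure."""
    try:
        facts = storage.get_user_facts(chat_id)
    except Exception:
        # Log for operators, but never let a storage error reach the user.
        logger.exception("context fetch failed for chat %s", chat_id)
        return ""
    return "\n".join(f"- {fact}" for fact in facts)
```

With BrokenStorage the agent would simply run without the extra context; the failure is visible only in the logs.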

Storage is pluggable. A StoragePort protocol defines the interface. SqliteAdapter and PostgresAdapter are drop-in implementations. Switching is a single env var change (STORAGE_BACKEND=postgres). Both adapters also implement vector similarity search — Python cosine in SQLite, pgvector HNSW in Postgres.
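
The shape of this design might look like the sketch below. The method names on StoragePort, the InMemoryAdapter class, the make_storage factory, and the "sqlite" default are assumptions for illustration; the real protocol is certainly larger. The cosine function shows the kind of pure-Python scoring the SQLite path is described as using.

```python
import math
import os
from typing import Optional, Protocol, Sequence


class StoragePort(Protocol):
    # Hypothetical subset of the real interface.
    def add_note(self, text: str, embedding: Sequence[float]) -> None: ...
    def search_similar(self, query: Sequence[float], limit: int = 3) -> list: ...


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryAdapter:
    """Stand-in for SqliteAdapter: scores every stored row in Python."""

    def __init__(self) -> None:
        self._rows = []  # list of (text, embedding) pairs

    def add_note(self, text: str, embedding: Sequence[float]) -> None:
        self._rows.append((text, list(embedding)))

    def search_similar(self, query: Sequence[float], limit: int = 3) -> list:
        ranked = sorted(self._rows, key=lambda row: cosine(query, row[1]), reverse=True)
        return [text for text, _ in ranked[:limit]]


def make_storage(backend: Optional[str] = None) -> StoragePort:
    """Pick an adapter from STORAGE_BACKEND (the default value is an assumption)."""
    backend = backend or os.environ.get("STORAGE_BACKEND", "sqlite")
    if backend == "postgres":
        raise NotImplementedError("PostgresAdapter is out of scope for this sketch")
    return InMemoryAdapter()
```

A linear scan like this is fine for a personal-scale note store; the Postgres path hands the same query to an index instead.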

Three-tier memory pipeline. Facts stated in conversation are extracted within seconds (post-message mini-model pass). Sessions are summarised ~30 minutes after going quiet (session-close mini-model pass). A nightly Reflection Engine distils the full history into a narrative profile, intentions, and behavioral learnings. All tiers feed the same user_facts and notes tables — the agent always has the freshest possible context. Before each agent run, up to three semantic retrieval layers (notes, summaries, saved articles) inject relevant context into the user turn, sharing a 1,000-token budget with XML tagging so the model can distinguish retrieved memory from instructions.
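
The shared budget with XML tagging could work along these lines. This is a sketch under stated assumptions: the tag names, the four-characters-per-token estimate, and the build_retrieved_context function are all hypothetical, and the cost of the tags themselves is ignored for simplicity.

```python
def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly four characters per token.
    return max(1, len(text) // 4)


def build_retrieved_context(layers: dict, budget: int = 1000) -> str:
    """Pack snippets layer by layer until the shared token budget runs out.

    `layers` maps a tag name to a list of snippets, most relevant first.
    Earlier layers get first claim on the budget.
    """
    parts = []
    remaining = budget
    for tag, snippets in layers.items():
        kept = []
        for snippet in snippets:
            cost = estimate_tokens(snippet)
            if cost > remaining:
                break  # this layer is done; later layers may still fit smaller items
            kept.append(snippet)
            remaining -= cost
        if kept:
            body = "\n".join(kept)
            parts.append(f"<{tag}>\n{body}\n</{tag}>")
    return "\n".join(parts)
```

Wrapping each layer in its own tag lets the model tell retrieved memory apart from instructions, and dropping empty layers avoids spending tokens on tags with nothing inside.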

Background loops run inside FastAPI lifespan. The reminder loop, daily briefing, scheduled tasks loop, alert loop, and nightly reflection are asyncio tasks started on startup and cancelled on shutdown — no separate workers or queues.
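
The start-on-startup, cancel-on-shutdown pattern can be sketched with the standard library alone. The loop body, the TICKS list, and the demo function are illustrative stand-ins; in a FastAPI app, a function shaped like lifespan below would be passed as FastAPI(lifespan=lifespan).

```python
import asyncio
from contextlib import asynccontextmanager

TICKS = []  # stand-in for real side effects such as sending a reminder


async def reminder_loop(interval: float) -> None:
    """Hypothetical loop: do one unit of work, then sleep, forever."""
    while True:
        TICKS.append("reminder")
        await asyncio.sleep(interval)


@asynccontextmanager
async def lifespan(app):
    # Startup: launch every background loop as an asyncio task.
    tasks = [asyncio.create_task(reminder_loop(0.01), name="reminder_loop")]
    try:
        yield  # the application serves requests while suspended here
    finally:
        # Shutdown: cancel the loops and wait for them to unwind.
        for task in tasks:
            task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)


async def demo() -> bool:
    """Run the lifespan briefly and report whether the loop really stopped."""
    async with lifespan(app=None):
        await asyncio.sleep(0.05)
    count = len(TICKS)
    await asyncio.sleep(0.03)
    return len(TICKS) == count
```

The trade-off of this design is that the loops share the web process: there is nothing extra to deploy, but a restart of the app also restarts every loop.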

Extensible by file-drop. Adding a new skill requires only creating a file in app/skills/ — no changes to core agent code.
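
A file-drop registry of this kind is often built on importlib. The sketch below assumes a convention in which each skill module exposes a register(registry) function; that convention and the load_skills name are hypothetical, and the real registry may discover skills differently.

```python
import importlib.util
from pathlib import Path


def load_skills(skills_dir: Path) -> dict:
    """Import every public .py file in skills_dir and let it register itself."""
    registry = {}
    for path in sorted(skills_dir.glob("*.py")):
        if path.name.startswith("_"):
            continue  # skip __init__.py and private helpers
        spec = importlib.util.spec_from_file_location(f"skills_{path.stem}", path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        # Assumed convention: a module-level register(registry) hook.
        register = getattr(module, "register", None)
        if callable(register):
            register(registry)
    return registry
```

Under this convention, adding a skill means adding one file; the loader never needs to know the skill's name in advance.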

Intent-routed for efficiency. Every message (Telegram and WhatsApp) is classified by a keyword matcher in microseconds. The message is then handled by a focused domain agent carrying only its relevant tools — reducing token usage per call and improving model accuracy. The full agent is always the fallback for ambiguous or cross-domain requests.
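
A keyword matcher with a full-agent fallback can be as small as the sketch below. The keyword sets and the classify function are invented for illustration and cover only three of the twelve domains; follow-up inheritance and the semantic fallback mentioned in the table are left out.

```python
import re

# Hypothetical keyword sets; the real classifier covers twelve domains.
DOMAIN_KEYWORDS = {
    "email": {"email", "inbox", "reply"},
    "calendar": {"calendar", "schedule", "appointment"},
    "github": {"github", "pr", "issue"},
}


def classify(message: str) -> str:
    """Return a single domain name, or "full" when the message is ambiguous."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    matched = [domain for domain, kws in DOMAIN_KEYWORDS.items() if words & kws]
    # Exactly one hit routes to a focused agent; zero or several fall back.
    return matched[0] if len(matched) == 1 else "full"
```

For instance, "Any new email today?" would route to the email agent, while "Email me the PR summary" touches two domains and falls through to the full agent.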


Quick Navigation

  • Curriculum — Pedagogic walkthrough of the whole codebase in 8 sessions (start here if you want to understand the system)
  • Architecture — Component diagram and how the pieces fit together
  • Message Lifecycle — Sequence diagrams for text, voice, image, and document messages
  • Memory & Reflection — How short-term history and long-term context work
  • Tools Reference — Every tool available to the agent (native, skills, MCP, GitHub, browser)
  • Background Loops — Reminders, briefing, scheduled tasks, nightly reflection
  • Storage Layer — Schema, adapters, and the storage protocol
  • Deployment — Railway setup and full environment variable reference