# Deployment
Kwasi runs on Railway using Docker. The bot starts automatically on deploy.
## How it Deploys

```mermaid
flowchart LR
    A[Push to main] --> B[Railway detects push]
    B --> C[Build Docker image\nDockerfile]
    C --> D[Start container\npython -m app.main]
    D --> E[FastAPI starts\nlifespan begins]
    E --> F[Telegram bot starts\nlong-polling]
    E --> G[Background loops start\nreminders · briefing · scheduled tasks · reflection]
```
Railway config (`railway.json`):

```json
{
  "build": { "builder": "DOCKERFILE" },
  "deploy": {
    "startCommand": "python -m app.main",
    "restartPolicyType": "ON_FAILURE",
    "restartPolicyMaxRetries": 10
  }
}
```
## Environment Variables

### Required

| Variable | Description |
|---|---|
| `TELEGRAM_TOKEN` | Bot token from @BotFather |
| `ALLOWED_TELEGRAM_USER_IDS` | Comma-separated list of your Telegram user IDs. If unset, all messages are blocked. |
| `GOOGLE_API_KEY` | Gemini API key — used for LLM inference, Vision, STT, and text embeddings (semantic search) |
| `STORAGE_BACKEND` | `postgres` in production |
| `DATABASE_URL` | PostgreSQL connection string (set automatically by Railway Postgres plugin) |
### LLM Provider (configure at least one)

| Variable | Description | Default |
|---|---|---|
| `GOOGLE_API_KEY` | For Gemini models | — |
| `ANTHROPIC_API_KEY` | For Claude models | — |
| `OPENAI_API_KEY` | For GPT models | — |
| `MODEL_NAME` | Primary model for all interactive responses | `google-gla:gemini-2.5-flash` |
| `MINI_MODEL_NAME` | Fast/cheap model for skill synthesis (CV extraction, research, read-later summaries, post-conversation fact extraction) | `google-gla:gemini-3-1-flash-lite-preview` |
| `REFLECTION_MODEL_NAME` | Optional override for the nightly reflection agent. If empty, falls back to `MODEL_NAME`. Set to a larger model for deeper nightly analysis without affecting interactive latency. | (empty) |
### Tools (optional)

| Variable | Description |
|---|---|
| `TAVILY_API_KEY` | Web search — agent degrades gracefully without it |
| `WEATHERAPI_KEY` | Weather — agent degrades gracefully without it |
| `GITHUB_TOKEN` | GitHub Personal Access Token with `repo` + `notifications` scope |
| `GOOGLE_MAPS_API_KEY` | Google Maps Platform key (Places, Directions, Geocoding APIs) |
| `SLACK_BOT_TOKEN` | Slack Bot User OAuth Token (`xoxb-...`) — bot must be invited to channels it reads |
| `IDFM_API_KEY` | Île-de-France Mobilités Prim' key for Paris transit status (free at prim.iledefrance-mobilites.fr) |
| `JIRA_BASE_URL` | Jira Cloud instance URL, e.g. `https://yourteam.atlassian.net` |
| `JIRA_EMAIL` | Atlassian account email address |
| `JIRA_API_TOKEN` | Jira API token from id.atlassian.com/manage-api-tokens |
### Semantic Search (optional, but recommended)

Semantic search is activated automatically when `GOOGLE_API_KEY` is set. No additional env var is needed. The same key used for Gemini inference is used for `gemini-embedding-001` embeddings.

**Postgres:** Requires pgvector ≥0.7 on your PostgreSQL instance. Railway's managed Postgres includes a recent pgvector — Kwasi runs `CREATE EXTENSION IF NOT EXISTS vector` on startup, then ensures embedding columns are `halfvec(3072)` (auto-migrating any legacy `vector(3072)` columns in place) and creates HNSW indexes with `halfvec_cosine_ops`. The 16-bit `halfvec` type is required because pgvector's HNSW caps at 2000 dims for full-precision `vector` — Gemini's 3072-dim output exceeds that, and without an index semantic search degrades to a sequential scan. If the extension is unavailable, semantic search falls back silently to keyword-only.
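The startup migration described above amounts to roughly the following DDL. This is a sketch, not Kwasi's actual migration code; the `notes` table and `embedding` column names are illustrative.

```python
# Sketch of the pgvector startup DDL. Table and column names
# ("notes", "embedding") are illustrative, not Kwasi's real schema.
PGVECTOR_SETUP = [
    # 1. No-op if the extension is already installed
    "CREATE EXTENSION IF NOT EXISTS vector",
    # 2. 16-bit halfvec: HNSW on full-precision vector caps at 2000 dims,
    #    but Gemini embeddings are 3072-dim
    "ALTER TABLE notes ALTER COLUMN embedding TYPE halfvec(3072)",
    # 3. HNSW index with cosine ops so similarity search avoids a seq scan
    (
        "CREATE INDEX IF NOT EXISTS notes_embedding_hnsw "
        "ON notes USING hnsw (embedding halfvec_cosine_ops)"
    ),
]

def statements() -> list[str]:
    """Return the DDL statements in the order they would run at boot."""
    return PGVECTOR_SETUP
```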
Optional retrieval tuning — message-history mode and thresholds are controlled via env vars (all default to safe values; see `app/config.py` for the exhaustive list):

| Var | Default | What it does |
|---|---|---|
| `ENABLE_SEMANTIC_HISTORY` | `false` | When `true`, `fetch_message_history()` returns `SEMANTIC_HISTORY_RECENT_COUNT` recent verbatim turns + up to `SEMANTIC_HISTORY_SEMANTIC_COUNT` semantically-relevant older interactions instead of the chronological last 10. Falls back to chronological on any failure. |
| `SEMANTIC_HISTORY_RECENT_COUNT` | `3` | How many recent turns are kept verbatim when semantic mode is on. |
| `SEMANTIC_HISTORY_SEMANTIC_COUNT` | `3` | How many semantic hits are added when semantic mode is on. |
| `SEMANTIC_HISTORY_THRESHOLD` | `0.6` | Minimum cosine similarity for a semantic-history hit. |
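The mixing behaviour can be sketched as follows. The helper names (`build_history`, the pre-fetched `recent` and `semantic_hits` inputs) are hypothetical stand-ins; the real logic lives in `fetch_message_history()`.

```python
# Sketch of semantic-history mixing. The inputs are assumed to be
# pre-fetched from storage; names here are illustrative.
def build_history(recent, semantic_hits, threshold=0.6,
                  recent_count=3, semantic_count=3):
    """Keep the newest turns verbatim, then prepend older turns whose
    cosine similarity to the current message clears the threshold."""
    verbatim = recent[-recent_count:]  # newest N turns, kept as-is
    older = [turn for turn, score in semantic_hits
             if score >= threshold and turn not in verbatim]
    return older[:semantic_count] + verbatim

recent = ["t1", "t2", "t3", "t4", "t5"]
hits = [("old_a", 0.9), ("old_b", 0.55), ("old_c", 0.7)]
print(build_history(recent, hits))
# → ['old_a', 'old_c', 't3', 't4', 't5']   (old_b fell below 0.6)
```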
After first deploy: existing records have no embeddings. Backfill them with:

```bash
curl -X POST https://your-app.railway.app/embed-backfill \
  -H "X-Reflection-Secret: your-reflection-secret"
```

Returns `{"notes": N, "interactions": N, "read_later": N}` — the count of newly embedded rows.
### Browser Automation (optional)

| Variable | Description |
|---|---|
| `BROWSER_ALLOWED_DOMAINS` | Comma-separated list of domains the browser tools may visit, e.g. `bbc.com,github.com`. If unset, all domains are allowed. `localhost` and private IPs are always blocked. |
### Voice & TTS (optional)

| Variable | Description | Default |
|---|---|---|
| `TTS_VOICE` | Microsoft Neural TTS voice name for voice replies | `en-GB-RyanNeural` |
### Background Loops

| Variable | Description | Default |
|---|---|---|
| `BRIEFING_CHAT_ID` | Your Telegram chat ID — activates morning briefing, task notifications, and user scheduled tasks | — |
| `BRIEFING_TIME` | UTC time for morning briefing (HH:MM) | `08:00` |
| `BRIEFING_WHATSAPP_NUMBER` | WhatsApp phone number to also receive briefings (E.164 without `+`, e.g. `33612345678`) | — |
| `REFLECTION_SECRET` | Any secret string — activates nightly reflection loop and `POST /reflect` endpoint | — |
| `USER_TIMEZONE` | IANA timezone for natural language time parsing and scheduled times — e.g. `Africa/Accra`, `Europe/Paris` | `UTC` |
| `EVENING_RECAP_TIME` | Local time for evening recap (HH:MM) | `21:00` |
| `WEEKLY_RECAP_DAY` | Day of week for weekly recap (monday–sunday) | `friday` |
| `WEEKLY_RECAP_TIME` | Local time for weekly recap (HH:MM) | `18:00` |
| `WEEKLY_PREP_DAY` | Day of week for weekly prep (monday–sunday) | `sunday` |
| `WEEKLY_PREP_TIME` | Local time for weekly prep (HH:MM) | `18:00` |
| `READ_LATER_DIGEST_DAY` | Day of week for read-later digest (monday–sunday) | `saturday` |
| `READ_LATER_DIGEST_TIME` | Local time for read-later digest (HH:MM) | `09:00` |
| `JOURNAL_DIGEST_DAY` | Day of week for weekly journal digest (monday–sunday) | `sunday` |
| `JOURNAL_DIGEST_TIME` | Local time for journal digest (HH:MM) | `19:00` |
| `EMAIL_INTEL_TIME` | Local time for daily email intelligence triage (HH:MM). Set to `""` to disable. | `09:30` |
| `MEETING_PREP_LEAD_MINUTES` | How many minutes before a meeting to send the prep brief | `30` |
| `MEETING_PREP_MIN_DURATION_MINUTES` | Minimum meeting duration (minutes) to trigger a prep brief — skips short calls | `10` |
**Formats are validated at startup.** All `HH:MM` fields must be in 24-hour format (e.g. `08:00`, not `8am`). Day fields must be lowercase English day names. `USER_TIMEZONE` must be a valid IANA timezone. Invalid values cause the app to refuse to start with a clear error message identifying the offending variable.
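The validation described above can be sketched roughly like this. The `validate` helper and its `kind` argument are illustrative; the real checks live in `app/config.py`.

```python
# Sketch of the startup schedule validation; names are illustrative.
import re
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

DAYS = {"monday", "tuesday", "wednesday", "thursday",
        "friday", "saturday", "sunday"}
HHMM = re.compile(r"([01]\d|2[0-3]):[0-5]\d")  # 24-hour HH:MM

def validate(name: str, value: str, kind: str) -> None:
    """Raise with the offending variable's name, as the app does at boot."""
    if kind == "time" and not HHMM.fullmatch(value):
        raise ValueError(f"{name}: expected 24-hour HH:MM, got {value!r}")
    if kind == "day" and value not in DAYS:
        raise ValueError(f"{name}: expected lowercase day name, got {value!r}")
    if kind == "tz":
        try:
            ZoneInfo(value)
        except ZoneInfoNotFoundError:
            raise ValueError(f"{name}: invalid IANA timezone: {value!r}")

validate("BRIEFING_TIME", "08:00", "time")     # ok
validate("WEEKLY_RECAP_DAY", "friday", "day")  # ok
```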
### Email & Calendar (optional)

| Variable | Description |
|---|---|
| `GMAIL_CLIENT_ID` | Google OAuth2 client ID |
| `GMAIL_CLIENT_SECRET` | Google OAuth2 client secret |
| `GMAIL_REFRESH_TOKEN` | Gmail refresh token for personal account (from `scripts/get_gmail_token.py`) |
| `GMAIL_WORK_REFRESH_TOKEN` | Gmail refresh token for work account — enables a second set of email tools for the work inbox |
| `OUTLOOK_CLIENT_ID` | Azure app client ID |
| `OUTLOOK_CLIENT_SECRET` | Azure app client secret |
| `OUTLOOK_TENANT_ID` | Azure tenant ID (default: `common`) |
| `OUTLOOK_REFRESH_TOKEN` | Outlook refresh token (from `scripts/auth_outlook.py`). Requires `Tasks.ReadWrite` scope for Microsoft To Do tools. |
| `GOOGLE_DRIVE_REFRESH_TOKEN` | Same token as `GMAIL_REFRESH_TOKEN` — `scripts/get_gmail_token.py` already requests `drive.readonly` scope. Just copy the same value. |
| `GOOGLE_DRIVE_WORK_REFRESH_TOKEN` | Same token as `GMAIL_WORK_REFRESH_TOKEN` (run `scripts/get_gmail_token.py --work`). Copy the same value. |
### WhatsApp (optional)

| Variable | Description |
|---|---|
| `WHATSAPP_VERIFY_TOKEN` | Webhook verification token (set in Meta developer portal) |
| `WHATSAPP_ACCESS_TOKEN` | Meta Cloud API access token |
| `WHATSAPP_PHONE_NUMBER_ID` | Phone number ID from the Meta Cloud API |
| `WHATSAPP_APP_SECRET` | App secret for webhook signature verification |
| `ALLOWED_WHATSAPP_NUMBERS` | Comma-separated phone numbers allowed to message (E.164 without `+`, e.g. `33612345678`). If unset, all senders are allowed. |
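Webhook signature verification with `WHATSAPP_APP_SECRET` works along these lines: Meta's Cloud API sends an `X-Hub-Signature-256` header containing an HMAC-SHA256 of the raw request body. The function name below is illustrative.

```python
# Sketch of WhatsApp webhook signature verification.
# Meta sends X-Hub-Signature-256 in the form "sha256=<hexdigest>".
import hashlib
import hmac

def verify_signature(app_secret: str, raw_body: bytes, header: str) -> bool:
    expected = "sha256=" + hmac.new(
        app_secret.encode(), raw_body, hashlib.sha256
    ).hexdigest()
    # constant-time comparison to avoid timing leaks
    return hmac.compare_digest(expected, header)

body = b'{"entry": []}'
good = "sha256=" + hmac.new(b"secret", body, hashlib.sha256).hexdigest()
print(verify_signature("secret", body, good))   # True
print(verify_signature("secret", body, "sha256=deadbeef"))  # False
```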
### Android / External API (optional)

| Variable | Description |
|---|---|
| `API_TOKEN` | If set, enables `POST /message` for Android HTTP Shortcuts and other external clients. Pass via `X-API-Token` header. |
#### Using `POST /message`

Accepts either `multipart/form-data` (file uploads) or `application/json` (base64 images).

Text message (multipart):

```bash
curl -X POST https://your-app.railway.app/message \
  -H "X-API-Token: your_token" \
  -F "text=What's on my calendar today?"
```

Text message (JSON):

```bash
curl -X POST https://your-app.railway.app/message \
  -H "X-API-Token: your_token" \
  -H "Content-Type: application/json" \
  -d '{"text": "What'\''s on my calendar today?"}'
```

Image (multipart file upload):

```bash
curl -X POST https://your-app.railway.app/message \
  -H "X-API-Token: your_token" \
  -F "text=What does this say?" \
  -F "file=@screenshot.png"
```

Image (JSON with base64):

```bash
curl -X POST https://your-app.railway.app/message \
  -H "X-API-Token: your_token" \
  -H "Content-Type: application/json" \
  -d "{\"text\": \"Analyse this\", \"image_b64\": \"$(base64 -i screenshot.png)\", \"mime_type\": \"image/png\"}"
```
**Response:** the agent's reply is also sent to Telegram (`BRIEFING_CHAT_ID`). Conversation history is shared with the Telegram interface — both use the same user ID, so Kwasi has full context across surfaces.
### Dashboard (optional)

| Variable | Description |
|---|---|
| `DASHBOARD_SECRET` | If set, enables `/dashboard/` and uses this value as the auth token. Pass via `X-Dashboard-Secret` header or `?token=` query param. |
### Health Data Ingest (optional — Spec 010)

Required only if you sideload the `bridge-android/` app to ingest Samsung Watch / Health Connect data.

| Variable | Description | Default |
|---|---|---|
| `HEALTH_INGEST_SECRET` | Shared secret for `POST /health/ingest` and `POST /health/backfill`. The router refuses to mount if unset (returns 503), so the endpoint is locked down by default. Generate with `openssl rand -hex 32`. | — |
| `HEALTH_BASELINE_DAYS` | Rolling window (in days) used by the agent's health read tools to compute HRV/RHR baselines. | `30` |
**Building the bridge:** Open `bridge-android/` in Android Studio (or run `./gradlew assembleDebug` from the directory once the wrapper is generated). Sideload the resulting APK, paste your Kwasi base URL + `HEALTH_INGEST_SECRET` into the app's settings, grant Health Connect permissions, and tap "Sync now". From then on the bridge runs every 15 min via WorkManager. Full setup notes in `bridge-android/README.md`.
**Backfilling history:** If you have a Samsung Health "Download Personal Data" CSV/JSON archive from before the bridge went live, upload it once via:

```bash
curl -X POST https://your-app.railway.app/health/backfill \
  -H "X-Health-Secret: $HEALTH_INGEST_SECRET" \
  -F "file=@samsung_health_export.json"
```
### Observability (optional)

Two complementary stacks share one OpenTelemetry tracer provider — see Architecture → Observability. Either can be disabled independently.

| Variable | Description |
|---|---|
| `LOGFIRE_TOKEN` | Pydantic Logfire token for infra tracing (FastAPI routes, background loops, exceptions). Activates `logfire.configure()` and `logfire.instrument_fastapi(app)`. |
| `LOGFIRE_READ_TOKEN` | Logfire read-only token — enables self-diagnosis tools ("what errors did you have this week?", "which tools are slowest?") via the `diagnostics_agent`. |
| `LANGFUSE_PUBLIC_KEY` | Langfuse public key — activates LLM-layer tracing (per-generation token usage + cost, sessions, user grouping, trace scores). Pair with the secret key below. |
| `LANGFUSE_SECRET_KEY` | Langfuse secret key. |
| `LANGFUSE_HOST` / `LANGFUSE_BASE_URL` | Langfuse endpoint (default `https://cloud.langfuse.com`). Either env-var name is accepted. |
**Trace scoring.** When Langfuse is enabled, three trace-level scores are emitted: `user_approval` (Confirm 1.0 / Cancel 0.0 on a `PendingAction`), `user_edit` (Edit tap), and `agent_error` (in-flight exception during a turn). Confirm/Cancel/Edit can fire minutes after the original turn closed — the trace ID is captured at gate time on `PendingAction.trace_id`, so the score still lands on the right trace.
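The capture-at-gate-time pattern can be sketched like this. The `PendingAction.trace_id` field mirrors the text; the shape of the score payload and the `on_confirm` helper are illustrative, not Langfuse's actual API.

```python
# Sketch of scoring a trace captured at gate time. The score dict shape
# and on_confirm helper are illustrative, not the real Langfuse API.
from dataclasses import dataclass

@dataclass
class PendingAction:
    description: str
    trace_id: str  # captured when the confirmation gate is shown

def on_confirm(action: PendingAction, approved: bool) -> dict:
    # Minutes later, when the user taps Confirm/Cancel, the score is
    # attached to the *stored* trace ID, not the current turn's trace.
    return {
        "trace_id": action.trace_id,
        "name": "user_approval",
        "value": 1.0 if approved else 0.0,
    }

pa = PendingAction("send email to Alice", trace_id="trace-abc")
print(on_confirm(pa, approved=True))
# → {'trace_id': 'trace-abc', 'name': 'user_approval', 'value': 1.0}
```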
**Managed prompts.** Nine system prompts can be tuned in the Langfuse UI without redeploying — full list and behaviour in Tools → Langfuse-managed Prompts. Sync workflow:

```bash
uv run python scripts/sync_prompts.py --check   # CI gate; exit 1 on drift
uv run python scripts/sync_prompts.py --push    # code → Langfuse + bump lock
uv run python scripts/sync_prompts.py --pull    # Langfuse → code + bump lock
uv run python scripts/sync_prompts.py --pull --dry-run
```

`scripts/langfuse_cleanup.py` is a separate operator tool for pruning unused prompts/traces — run it interactively when needed.
When `LOGFIRE_TOKEN` is set, the following spans are recorded:

| Span | Attributes | What it captures |
|---|---|---|
| `telegram/message` | `user_id`, `chars`, `categories`, `agent`, `type`, `est_tokens_history`, `est_tokens_context`, `est_tokens_user_turn`, `context_layers`, `history_interactions` | Full message handling — text, voice, photo, document; includes per-request token breakdown |
| `whatsapp/message` | `user_id`, `chars`, `categories` | Full WhatsApp message handling |
| `api/message` | `chars`, `has_image` | `POST /message` endpoint (Android HTTP Shortcuts) |
| `embedding/embed_text` | `chars` | Every Gemini embedding API call |
| `loop/briefing` | — | Morning briefing execution |
| `loop/evening_recap` | — | Evening recap execution |
| `loop/weekly_recap` | — | Weekly recap execution |
| `loop/weekly_prep` | — | Weekly prep execution |
| `loop/read_later_digest` | — | Read-later digest execution |
| `loop/journal_digest` | — | Weekly journal digest execution |
| `loop/email_intel` | — | Daily email intelligence triage |
| `loop/reflection` | — | Nightly reflection run |
| `loop/alert_fired` | `rule` | When a proactive alert rule fires |
| `loop/meeting_prep` | `event` | When a meeting prep brief is sent |
Token breakdown attributes on `telegram/message` spans:

| Attribute | What it measures |
|---|---|
| `est_tokens_history` | Estimated tokens from `message_history` (conversation context) |
| `est_tokens_context` | Estimated tokens from XML context blocks (notes/summaries/read-later) |
| `est_tokens_user_turn` | Estimated tokens for the full enriched user turn (datetime + context + message) |
| `context_layers` | Number of context injection layers that contributed (0–3) |
| `history_interactions` | Number of past interactions kept after token-budget trimming |
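These attributes are estimates, not exact tokenizer counts. A sketch of how they might be computed, assuming a rough 4-characters-per-token heuristic (an assumption, not Kwasi's documented formula):

```python
# Sketch of the est_tokens_* attributes. The ~4 chars/token heuristic
# is an assumption for illustration, not Kwasi's exact formula.
def est_tokens(text: str) -> int:
    return max(1, len(text) // 4)

history = ["user: hello there", "bot: hi, how can I help?"]
context_xml = "<notes><note>buy milk</note></notes>"
user_turn = "2025-01-01T09:00Z " + context_xml + " what's on today?"

span_attrs = {
    "est_tokens_history": sum(est_tokens(t) for t in history),
    "est_tokens_context": est_tokens(context_xml),
    "est_tokens_user_turn": est_tokens(user_turn),
}
print(span_attrs)
```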
Pydantic AI's `instrument_pydantic_ai()` nests tool calls and model calls as child spans under each message span, giving a complete end-to-end trace tree for every interaction.
## Local Development

```bash
# Install dependencies
uv sync
# pgvector is required for production Postgres semantic search:
# the Python client is installed via uv sync (listed in pyproject.toml);
# the server-side extension must be available (Railway's Postgres includes it)
# For local SQLite dev, embeddings are stored as JSON — no extension needed

# Run in server mode (Telegram + WhatsApp webhook)
uv run python -m app.main

# Run in CLI mode (no Telegram needed)
uv run python -m app.main --cli

# Run tests
uv run pytest

# Lint
uv run ruff check .
```
Storage defaults to SQLite (`kwasi.db` in the project root) when `STORAGE_BACKEND` is not set.
## Finding Your Telegram Chat ID

Your Telegram chat ID is the same as your Telegram user ID for direct (non-group) conversations. You can find it by:

- Sending a message to @userinfobot on Telegram
- Or checking the `ALLOWED_TELEGRAM_USER_IDS` value you already have — for a personal bot, these are the same

Set both `ALLOWED_TELEGRAM_USER_IDS` and `BRIEFING_CHAT_ID` to this value.
## Web Dashboard

If `DASHBOARD_SECRET` is set, the dashboard is available at `/dashboard/`. It has four views:

- `/dashboard/` — Recent interactions (last 50)
- `/dashboard/tools` — Audit log (tool calls with sanitised arguments)
- `/dashboard/tasks` — Scheduled tasks (view + toggle enabled)
- `/dashboard/memory` — Current user context profile from the Reflection Engine

Authenticate by passing the secret as a header or query param:

```bash
# Header
curl -H "X-Dashboard-Secret: your_secret" https://your-app.railway.app/dashboard/

# Query param (browser friendly)
https://your-app.railway.app/dashboard/?token=your_secret
```
## GitHub Token Setup

- Go to GitHub → Settings → Developer settings → Personal access tokens (classic)
- Generate a new token with scopes: `repo`, `notifications`, `read:user`
- Set `GITHUB_TOKEN=ghp_xxxxxxxxxxxxxxxxxxxx` in Railway
## Hosting Docs Locally

Serve the docs with MkDocs (typically `uv run mkdocs serve`), then open http://localhost:8000.

The docs are deployed automatically to GitHub Pages on every push to `main` that touches `docs/` or `mkdocs.yml`.