v2.0 · Streaming replies · Desktop app for Windows/macOS/Linux

One intelligent AI, everywhere you communicate

CortexFlow-AI connects a single AI agent to all your messaging platforms — with smarter 3-tier memory, task-aware LLM routing, token-by-token streaming replies, and voice that actually works. Self-hosted. Privacy-first. No cloud required.

$ pip install cortexflow-ai
$ cortex init              # guided setup: model + 1 channel
$ cortex start --background
✓ Gateway live on 127.0.0.1:7432 — your agent is online.
14 Channel Adapters
3-Tier Memory Pipeline
5 LLM Providers
1200+ Automated Tests
99.7% Test Coverage
Features

Everything a personal AI assistant needs

Built from the ground up to be smarter, more private, and more extensible than the alternatives.

Multi-Channel Gateway

Telegram, Discord, Slack, WhatsApp, Email, SMS, Matrix, IRC, Signal, Webhook, Mastodon, Teams, Mattermost, Nextcloud — one agent, every platform.

3-Tier Memory

Redis short-term context, Qdrant semantic search, SQLite long-term persistence — shared across every channel, auto-pruned by importance.

Task-Aware LLM Routing

Claude for deep reasoning, Gemini Flash for speed, DeepSeek for code, GPT-4, or Ollama for fully offline/private — with automatic fallback chains.

Voice Everywhere

Local Whisper STT, ElevenLabs/Kokoro/system TTS, open-source wake-word detection, and full voice-note round trips on Telegram and Discord.

Reflection Engine

Every response is quality-scored before it reaches you — low-quality answers are automatically regenerated with corrective guidance.

Typed Plugin SDK

A dependency-free cortexflow-sdk package for building tools, channel adapters, and plugins — sandboxed, typed, pip-installable.

Streaming Replies

Token-by-token streaming for all 5 LLM providers — Claude, Gemini, DeepSeek, GPT-4, and Ollama each stream natively, no chunking-after-the-fact.

Native Desktop App

A real Windows/macOS/Linux app — system tray, global hotkey, auto-start on login, and native notifications. Same dashboard, wrapped in Tauri.
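
To make "task-aware routing with automatic fallback chains" concrete, here is a minimal Python sketch of the idea. It is not the gateway's actual router: the task labels, the provider order in `FALLBACK_CHAINS`, and the `ProviderError` type are all assumptions made for illustration.

```python
from typing import Callable

# Hypothetical task labels and provider order (assumed for illustration only).
FALLBACK_CHAINS: dict[str, list[str]] = {
    "reasoning": ["claude", "gpt-4", "ollama"],
    "quick": ["gemini-flash", "gpt-4", "ollama"],
    "code": ["deepseek", "claude", "ollama"],
    "private": ["ollama"],  # privacy mode stays on the local machine
}


class ProviderError(Exception):
    """Raised by a provider callable when it cannot produce a reply."""


def route(task: str, prompt: str,
          providers: dict[str, Callable[[str], str]]) -> tuple[str, str]:
    """Try each provider in the chain for `task`; return (provider, reply)."""
    errors = []
    for name in FALLBACK_CHAINS[task]:
        call = providers.get(name)
        if call is None:
            continue  # provider not configured, move down the chain
        try:
            return name, call(prompt)
        except ProviderError as exc:
            errors.append(f"{name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(errors))


# Stub providers standing in for real API clients.
def rate_limited(prompt: str) -> str:
    raise ProviderError("rate limited")


def local_model(prompt: str) -> str:
    return f"[ollama] {prompt}"
```

With `{"deepseek": rate_limited, "ollama": local_model}` configured, a `"code"` task falls through DeepSeek (error) and Claude (not configured) and lands on the local model.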

Desktop App

A native app for Windows, macOS, and Linux

The same dashboard you'd run in a browser, wrapped as a real desktop app — system tray, a global hotkey, auto-start on login, and native notifications.

Windows

No admin rights required

Download for Windows or the .msi installer

macOS

Apple Silicon

Download for macOS

Linux

AppImage — runs on most distros

Download for Linux or the .deb package
Ctrl+Shift+Space: Summon the window from anywhere, instantly, even when it is minimized to the tray.
Native notifications: Know when the agent replies, even when the window isn't focused.
Auto-start on login: Always running in the background, with no manual launch and no admin prompt.
System tray + unread badges: Lives quietly in the tray, with a live unread count per connected channel.
How It Works

From message to reply in five steps

Every channel — Telegram, Discord, Slack, or the REST API — flows through the same gateway pipeline.

1. Message Arrives

A channel adapter (Telegram, Discord, Slack, …) receives an inbound message and normalizes it.

2. Memory Retrieval

The session pipeline pulls recent context from Redis, relevant facts from Qdrant, and history from SQLite.

3. Model Routing

The task-aware router picks Claude, Gemini, DeepSeek, GPT-4, or local Ollama based on complexity and privacy mode.

4. Reflection

The response is quality-scored before delivery; low-quality answers are regenerated with corrective guidance.

5. Stream Reply & Store

The reply streams back token-by-token through the originating channel as it's generated, and the exchange is written back into memory.
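
The five steps above can be sketched as one small Python function. This is an illustration under stated assumptions, not the gateway's code: `handle_message`, its callable parameters, the `threshold` cutoff, and the `max_retries` budget are all hypothetical. The sketch also buffers the full reply so it can be scored, then yields it word by word; how the real gateway combines reflection with live streaming is not described on this page.

```python
from typing import Callable, Iterator


def handle_message(
    text: str,
    retrieve: Callable[[str], list[str]],
    generate: Callable[[str, list[str], str], str],
    score: Callable[[str], float],
    store: Callable[[str, str], None],
    threshold: float = 0.7,  # hypothetical quality cutoff
    max_retries: int = 2,    # hypothetical regeneration budget
) -> Iterator[str]:
    """Run one message through the pipeline and yield the reply in pieces."""
    normalized = text.strip()                         # 1. message arrives
    context = retrieve(normalized)                    # 2. memory retrieval
    guidance = ""
    reply = generate(normalized, context, guidance)   # 3. routed model call
    for _ in range(max_retries):                      # 4. reflection
        if score(reply) >= threshold:
            break
        guidance = "Previous answer scored low; be more specific."
        reply = generate(normalized, context, guidance)
    store(normalized, reply)                          # 5. write back to memory
    for i, token in enumerate(reply.split()):         # 5. stream the reply
        yield token if i == 0 else " " + token
```

Passing the stages in as callables keeps the sketch independent of any particular memory store or LLM client.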

See It In Action

Talk to it however you like

Same agent, same memory, three ways in: a chat channel, the REST API, or your own plugin.

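// Connect to the gateway and send a chat message. Replies stream
// in token-by-token, not as one final blob.
// Written for Node.js 22+ (global WebSocket); in a browser, replace
// process.stdout.write with your own rendering.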
const ws = new WebSocket("ws://127.0.0.1:7432/ws");

ws.onopen = () => {
  ws.send(JSON.stringify({
    type: "message",
    id: "msg-1",
    text: "Summarize my last 3 conversations"
  }));
};

ws.onmessage = (event) => {
  const frame = JSON.parse(event.data);
  // frame.type: "hello" | "message_chunk" | "message_done" | "error"
  if (frame.type === "message_chunk") process.stdout.write(frame.delta);
  if (frame.type === "message_done") console.log("\n[done]", frame.text);
};

# Search memory across all three tiers
curl -s "http://127.0.0.1:7432/api/v1/memory/search?q=portfolio&limit=5" | jq

# Check gateway + channel status
curl -s http://127.0.0.1:7432/api/v1/status | jq
curl -s http://127.0.0.1:7432/api/v1/channels | jq

from cortexflow_sdk import Tool, ToolResult

class WeatherTool(Tool):
    name = "get_weather"
    description = "Look up current weather for a city."

    async def run(self, city: str) -> ToolResult:
        try:
            data = await self._fetch(city)
            return ToolResult.ok(data)
        except Exception as exc:
            return ToolResult.error(str(exc))

# pip install cortexflow-sdk
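
If you'd rather consume the WebSocket stream from Python, the frame types shown in the JavaScript example (`hello`, `message_chunk`, `message_done`, `error`) can be reassembled with the standard library alone. The sketch below takes already-received text frames, so it works with whichever WebSocket client you prefer. The `collect_reply` helper is hypothetical, and falling back to the joined deltas when `message_done` carries no `text` is an assumption.

```python
import json
from typing import Iterable


def collect_reply(raw_frames: Iterable[str]) -> str:
    """Rebuild one reply from JSON text frames, in arrival order."""
    parts: list[str] = []
    for raw in raw_frames:
        frame = json.loads(raw)
        kind = frame.get("type")
        if kind == "message_chunk":
            parts.append(frame["delta"])
        elif kind == "message_done":
            # Prefer the final text if present; otherwise join the deltas.
            return frame.get("text") or "".join(parts)
        elif kind == "error":
            raise RuntimeError(f"gateway error frame: {frame}")
        # "hello" and any unknown frame types are ignored here
    raise ConnectionError("stream ended before message_done")
```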
Why CortexFlow-AI

How it beats OpenClaw

OpenClaw popularized the personal-AI-gateway idea. CortexFlow-AI goes further on the dimensions that matter most.

Dimension | OpenClaw | CortexFlow-AI
Memory | LanceDB only (flat vector) | Redis + Qdrant + SQLite (3-tier)
LLM Routing | Manual model config | Auto task-aware routing + fallback chains
Voice | macOS/iOS wake-word only | Cross-platform STT/TTS + open-source wake word
Web UI | Static WebChat widget | Full dashboard: memory explorer, history, metrics
Configuration | Complex YAML (~50 keys) | Simple TOML, works in 3 lines
Observability | Stdout logs only | Structured JSON logs + Prometheus metrics
Plugin Security | In-process, no sandboxing | Subprocess-sandboxed, typed SDK
Architecture

A single gateway, four cooperating systems

Local-first by design — runs as one daemon on your own machine or server, no required cloud dependency.

Clients: Web UI · Desktop App (Tauri) · CLI (cortex) · REST API · 14 Channel Platforms
CortexFlow-AI Gateway: FastAPI + WebSocket daemon · ws://127.0.0.1:7432
  Channel Manager: Per-channel session isolation
  Model Router: Claude · Gemini · DeepSeek · GPT-4 · Ollama
  Memory Pipeline: Redis · Qdrant · SQLite
  Voice + Plugins: Whisper/TTS · Sandboxed SDK
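
For a feel of how a three-tier Memory Pipeline with importance-based pruning might behave, here is a toy Python model. It is a sketch under heavy assumptions: a `deque` stands in for Redis, naive word overlap stands in for Qdrant's vector search, and only the SQLite tier uses the real engine (in memory). The `TieredMemory` class, its table schema, and its pruning rule are hypothetical.

```python
import sqlite3
from collections import deque


class TieredMemory:
    """Toy three-tier memory: recent window, keyword match, SQLite store."""

    def __init__(self, short_term_size: int = 4, long_term_cap: int = 100):
        self.recent = deque(maxlen=short_term_size)  # stand-in for Redis
        self.facts: list[str] = []                   # stand-in for Qdrant
        self.db = sqlite3.connect(":memory:")        # long-term tier
        self.db.execute(
            "CREATE TABLE entries "
            "(id INTEGER PRIMARY KEY, content TEXT, importance REAL)"
        )
        self.cap = long_term_cap

    def remember(self, content: str, importance: float) -> None:
        self.recent.append(content)
        self.facts.append(content)
        self.db.execute(
            "INSERT INTO entries (content, importance) VALUES (?, ?)",
            (content, importance),
        )
        self.prune()

    def prune(self) -> None:
        # Keep only the `cap` most important long-term rows.
        self.db.execute(
            "DELETE FROM entries WHERE id NOT IN "
            "(SELECT id FROM entries ORDER BY importance DESC, id DESC LIMIT ?)",
            (self.cap,),
        )

    def retrieve(self, query: str, limit: int = 3) -> dict[str, list[str]]:
        words = set(query.lower().split())
        # Naive word overlap; a real semantic tier would compare embeddings.
        matches = [f for f in self.facts if words & set(f.lower().split())]
        rows = self.db.execute(
            "SELECT content FROM entries ORDER BY importance DESC LIMIT ?",
            (limit,),
        ).fetchall()
        return {
            "recent": list(self.recent),
            "relevant": matches[:limit],
            "long_term": [row[0] for row in rows],
        }
```

For simplicity, pruning here only touches the SQLite tier; the keyword list keeps every fact.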
Packages

Published on PyPI today

The gateway itself, the plugin SDK, and three example plugins are all real, installable packages.

cortexflow-ai

The full gateway — multi-channel daemon, 3-tier memory, model routing, voice, and the cortex CLI. Business Source License 1.1 (free for non-production use).

pip install cortexflow-ai

cortexflow-sdk

Typed Plugin, Tool, and ChannelAdapter base classes — zero gateway dependencies.

pip install cortexflow-sdk

cortexflow-github

Lists recent GitHub repository events (pushes, PRs, issues) via the public REST API.

pip install cortexflow-github

cortexflow-notion

Searches Notion pages and databases shared with your integration.

pip install cortexflow-notion

cortexflow-google-calendar

Lists upcoming Google Calendar events for the connected account.

pip install cortexflow-google-calendar
REST & WebSocket API

A small, predictable API surface

No SDK required for the gateway itself — every route is plain JSON over HTTP and one WebSocket connection.

Method | Path | Description
GET | /api/v1/status | Gateway health, uptime, version
GET | /api/v1/channels | List connected channels
POST | /api/v1/channels/{id}/send | Send a test message
GET | /api/v1/memory/search | Search across all 3 memory tiers
GET | /api/v1/memory/entries | List long-term memory entries
PATCH | /api/v1/memory/entries/{id} | Edit content or importance
DELETE | /api/v1/memory/entries/{id} | Delete a memory entry
GET | /api/v1/sessions | List active sessions
GET | /api/v1/metrics/snapshot | Prometheus metrics, JSON form
WS | /ws | Chat + session event stream
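
Because the routes above are plain JSON over HTTP, Python's standard library is enough to call them. The sketch below is a thin helper rather than an official client: `api_url` and `get_json` are hypothetical names, and the response shapes are not assumed beyond "valid JSON".

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:7432"  # default gateway address shown on this page


def api_url(path: str, **params) -> str:
    """Build a full URL for a gateway route, with optional query parameters."""
    query = f"?{urlencode(params)}" if params else ""
    return f"{BASE_URL}{path}{query}"


def get_json(path: str, timeout: float = 5.0, **params):
    """GET a route and decode its JSON body. Requires a running gateway."""
    with urlopen(api_url(path, **params), timeout=timeout) as resp:
        return json.load(resp)


# Example calls (need a live gateway, so they are left commented out):
# get_json("/api/v1/status")
# get_json("/api/v1/memory/search", q="portfolio", limit=5)
```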

Full API reference

Every route above — request/response shapes, auth (there is none; it's a local single-user daemon), and WebSocket frame types — is documented with real examples cross-checked against the gateway source.

Quickstart

Running in under a minute

Works with three lines of config: a model and one channel token.

# 1. Install — published on PyPI
pip install cortexflow-ai

# 2. Guided setup wizard — model + channel + voice test
cortex init

# 3. Start the gateway daemon
cortex start --background

# 4. Or just talk to it right now from the terminal
cortex chat

# No Python setup needed: pull the public image
docker pull ghcr.io/theamitchandra/cortexflow-ai:latest

docker run -d --name cortexflow-ai -p 7432:7432 \
  -e ANTHROPIC_API_KEY=sk-ant-... \
  -v cortexflow-data:/root/.cortexflow \
  ghcr.io/theamitchandra/cortexflow-ai:latest

curl http://localhost:7432/health
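
A freshly started container may need a moment before it accepts connections, so scripted setups often poll before sending traffic. Here is a small sketch of that pattern. It assumes the `/health` route used in the `curl` command above answers with HTTP 200 once the gateway is ready; `http_ok` and `wait_until_ready` are hypothetical helpers, not part of the `cortex` CLI.

```python
import time
from typing import Callable
from urllib.request import urlopen


def http_ok(url: str) -> bool:
    """Return True if a GET to `url` succeeds with status 200."""
    try:
        with urlopen(url, timeout=2) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, HTTPError, and refused connections
        return False


def wait_until_ready(probe: Callable[[], bool],
                     attempts: int = 30, delay: float = 1.0) -> bool:
    """Call `probe` up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


# Example (needs the container from the docker run command above):
# wait_until_ready(lambda: http_ok("http://localhost:7432/health"))
```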