
How Does Dume.ai Work Under the Hood?

Team Dume.ai
Jun 21, 2025 • 3 min read
In a world drowning in emails, tasks, meetings, and docs, AI assistants like Dume.ai are becoming essential tools for knowledge workers. But have you ever wondered what’s happening under the hood?
How does Dume.ai extract action items from your inbox, summarize your meetings, and automate workflows across Gmail, Notion, Jira, and more?
In this post, we’ll dive into the technical architecture, the AI pipeline, and the system design decisions powering Dume.ai—giving you a detailed look at how AI assistants work.
What Is Dume.ai?
Before we go deep into internals, here’s a quick overview.
Dume.ai is a personal AI agent that connects to your digital workspaces—email, calendar, docs, and tasks—and lets you chat with your work. It fetches, summarizes, organizes, and automates—all through a conversational interface powered by models like GPT-4, Gemini, and Claude.
High-Level Architecture
Dume.ai follows a modular microservices-based architecture:
- Frontend: Next.js (React-based)
- Backend: Node.js + TypeScript
- AI Workers: Event-driven job queues
- Storage: PostgreSQL + Redis
- Vector DB: Weaviate or pgvector
- AI Stack: OpenAI, Gemini, Claude, LangChain, LlamaIndex
- OAuth 2.0: Secure integration with third-party tools
Here’s what this looks like in layers:
[Figure: Architecture Diagram]
Data Integration Layer
Dume.ai begins by connecting securely with your tools:
- ✅ Gmail via Gmail API
- ✅ Google Calendar for meetings
- ✅ Notion for notes and tasks
- ✅ Jira & Confluence via Atlassian APIs
Powered by OAuth 2.0
Users grant access to specific scopes. We normalize all fetched content into a unified internal data schema. This allows consistent parsing and AI processing regardless of the source.
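To make the normalization step concrete, here is a minimal sketch that maps two source-specific payloads into one shared shape. The `WorkItem` type, its field names, and both simplified input shapes are assumptions made for this example, not Dume.ai's actual schema.

```typescript
// Hypothetical unified shape; field names are illustrative only.
type SourceTool = "gmail" | "calendar" | "notion" | "jira";

interface WorkItem {
  id: string;             // prefixed with the source so IDs never collide across tools
  source: SourceTool;
  title: string;
  body: string;
  timestamp: string;      // ISO 8601, UTC
  participants: string[];
}

// Simplified stand-ins for what the source APIs return.
interface RawEmail {
  messageId: string;
  subject: string;
  text: string;
  from: string;
  to: string[];
  receivedMs: number;
}

interface RawCalendarEvent {
  eventId: string;
  summary: string;
  description?: string;
  startIso: string;
  attendees: string[];
}

function fromEmail(e: RawEmail): WorkItem {
  return {
    id: `gmail:${e.messageId}`,
    source: "gmail",
    title: e.subject,
    body: e.text,
    timestamp: new Date(e.receivedMs).toISOString(),
    participants: [e.from, ...e.to],
  };
}

function fromCalendarEvent(ev: RawCalendarEvent): WorkItem {
  return {
    id: `calendar:${ev.eventId}`,
    source: "calendar",
    title: ev.summary,
    body: ev.description ?? "",
    timestamp: new Date(ev.startIso).toISOString(),
    participants: ev.attendees,
  };
}
```

Once every connector emits the same shape, downstream steps such as chunking, embedding, and prompting only need to understand `WorkItem`.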
Performance Optimizations
- Webhooks + polling fallback
- Delta sync using sync tokens
- Queue-based ingestion system for large orgs
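Delta sync with sync tokens can be sketched as follows. The `ChangeSource` interface and the `token_expired` result are hypothetical; they are loosely modeled on APIs such as Google Calendar, which hand back a token with each listing and reject stale tokens so the client can fall back to a full sync. The sketch is synchronous to stay short; a real client would be async.

```typescript
// Hypothetical result of asking a source for changes.
type ChangeResult<T> =
  | { kind: "ok"; items: T[]; nextSyncToken: string }
  | { kind: "token_expired" };

interface ChangeSource<T> {
  // No token: return everything. With a token: return only changes since that token.
  listChanges(syncToken?: string): ChangeResult<T>;
}

interface SyncOutcome<T> {
  items: T[];
  fullSync: boolean;   // true when the whole dataset was re-fetched
  syncToken: string;   // store this for the next run
}

function deltaSync<T>(source: ChangeSource<T>, storedToken?: string): SyncOutcome<T> {
  let fullSync = storedToken === undefined;
  let result = source.listChanges(storedToken);

  if (result.kind === "token_expired") {
    // The stored token is no longer valid: drop it and re-fetch everything.
    fullSync = true;
    result = source.listChanges(undefined);
  }
  if (result.kind !== "ok") {
    throw new Error("source rejected a full sync request");
  }
  return { items: result.items, fullSync, syncToken: result.nextSyncToken };
}
```

The caller persists `syncToken` after each run, so the next webhook or polling tick only transfers what changed.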
Embedding & Vector Search
After ingestion, Dume.ai transforms content into embeddings using models like:
- text-embedding-3-small (OpenAI)
- bge-small-en-v1.5 (HuggingFace)
Chunks are created using recursive text splitting and stored in:
- Weaviate (primary)
- or PostgreSQL with pgvector (fallback)
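For readers unfamiliar with recursive text splitting, here is a small standalone sketch of the idea: try the coarsest separator first and only fall back to finer ones for pieces that are still too long. The separator list and the `splitRecursive` name are assumptions for illustration; this is not LangChain's splitter or our production code.

```typescript
// Separators are tried from coarsest (paragraphs) to finest (words).
const SEPARATORS = ["\n\n", "\n", ". ", " "];

function splitRecursive(text: string, maxLen: number, level = 0): string[] {
  if (maxLen < 1) throw new Error("maxLen must be at least 1");
  if (text.length <= maxLen) return text.length > 0 ? [text] : [];

  if (level >= SEPARATORS.length) {
    // No separator left: hard-cut into fixed-size pieces.
    const pieces: string[] = [];
    for (let i = 0; i < text.length; i += maxLen) pieces.push(text.slice(i, i + maxLen));
    return pieces;
  }

  const sep = SEPARATORS[level];
  const chunks: string[] = [];
  let current = "";
  for (const part of text.split(sep)) {
    const candidate = current === "" ? part : current + sep + part;
    if (candidate.length <= maxLen) {
      current = candidate; // keep packing parts into the current chunk
      continue;
    }
    if (current !== "") chunks.push(current);
    if (part.length <= maxLen) {
      current = part;
    } else {
      // A single part is still too long: recurse with the next, finer separator.
      chunks.push(...splitRecursive(part, maxLen, level + 1));
      current = "";
    }
  }
  if (current !== "") chunks.push(current);
  return chunks;
}
```

Every returned chunk is at most `maxLen` characters, and paragraph and sentence boundaries are preserved wherever the budget allows.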
This powers Dume.ai’s RAG system—Retrieval-Augmented Generation—for grounding large language models in your actual work data.
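The retrieval half of RAG boils down to "find the chunks closest to the query vector, then put them in the prompt". The sketch below does this by brute force in memory with toy two-dimensional vectors; in practice the vector database performs the nearest-neighbor search. `EmbeddedChunk`, `topK`, and `buildGroundedPrompt` are hypothetical names used only here.

```typescript
interface EmbeddedChunk {
  id: string;
  text: string;
  vector: number[];
}

function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Rank all chunks by similarity to the query and keep the best k.
function topK(query: number[], chunks: EmbeddedChunk[], k: number): EmbeddedChunk[] {
  return chunks
    .map((c) => ({ chunk: c, score: cosine(query, c.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((s) => s.chunk);
}

// Ground the model by placing the retrieved text in front of the question.
function buildGroundedPrompt(question: string, context: EmbeddedChunk[]): string {
  const sources = context.map((c, i) => `[${i + 1}] ${c.text}`).join("\n");
  return `Answer using only the sources below.\n\nSources:\n${sources}\n\nQuestion: ${question}`;
}
```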
AI Orchestration with LangChain
We use LangChain to route tasks to specialized agents:
Task | Agent | Model |
---|---|---|
Email Reply | Email Agent | GPT-4 |
Digest Summary | Calendar Agent | Gemini Pro |
Task Extraction | ActionAgent | Claude 3 |
These chains handle:
- Document retrieval
- Prompt templating
- Output parsing
- Function calling for structured responses
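Two of these steps, prompt templating and output parsing, are easy to show without any framework. The following is a minimal sketch in plain TypeScript rather than LangChain's own classes; the template text, the `ActionItem` shape, and the function names are assumptions for illustration.

```typescript
interface ActionItem {
  title: string;
  owner: string | null;
  due: string | null;
}

// Replace {name} placeholders; fail loudly if a variable is missing.
function renderTemplate(template: string, vars: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (_match: string, key: string) => {
    if (!Object.prototype.hasOwnProperty.call(vars, key)) {
      throw new Error(`missing template variable: ${key}`);
    }
    return vars[key];
  });
}

const TASK_EXTRACTION_TEMPLATE =
  "Extract action items from the email below. Respond with a JSON array of " +
  'objects with keys "title", "owner", "due".\n\nEmail:\n{email}';

// Turn raw model text into validated, typed action items.
function parseActionItems(raw: string): ActionItem[] {
  // Models sometimes wrap JSON in prose, so take the outermost [...] span.
  const start = raw.indexOf("[");
  const end = raw.lastIndexOf("]");
  if (start === -1 || end <= start) throw new Error("no JSON array in model output");

  const parsed: unknown = JSON.parse(raw.slice(start, end + 1));
  if (!Array.isArray(parsed)) throw new Error("expected an array");

  return parsed.map((entry: any) => {
    if (typeof entry?.title !== "string") throw new Error("action item needs a title");
    return {
      title: entry.title,
      owner: typeof entry.owner === "string" ? entry.owner : null,
      due: typeof entry.due === "string" ? entry.due : null,
    };
  });
}
```

Validating the parsed output before acting on it matters more than the template itself: anything that fails the check can be retried or surfaced to the user instead of silently creating a malformed task.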
Dume.ai also supports tool use and multi-step reasoning, enabled via RouterChain + ToolCallingRouter.
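Conceptually, routing is a lookup from a classified task to an agent and a model. The sketch below mirrors the task table above in plain TypeScript; the keyword rules are a stand-in for a real classifier, and none of the identifiers are LangChain APIs.

```typescript
type TaskKind = "email_reply" | "digest_summary" | "task_extraction";

interface RouteTarget {
  agent: string;
  model: string;
}

// Mirrors the task / agent / model table; keys are illustrative.
const ROUTES: Record<TaskKind, RouteTarget> = {
  email_reply: { agent: "Email Agent", model: "GPT-4" },
  digest_summary: { agent: "Calendar Agent", model: "Gemini Pro" },
  task_extraction: { agent: "ActionAgent", model: "Claude 3" },
};

// Hypothetical keyword rules; a production router would more likely use an LLM or a trained classifier.
const KEYWORD_RULES: Array<{ kind: TaskKind; pattern: RegExp }> = [
  { kind: "email_reply", pattern: /\b(reply|respond|draft)\b/i },
  { kind: "digest_summary", pattern: /\b(summar(y|ize|ise)|digest|recap)\b/i },
  { kind: "task_extraction", pattern: /\b(action items?|tasks?|to-?dos?)\b/i },
];

function classifyTask(request: string): TaskKind | null {
  for (const rule of KEYWORD_RULES) {
    if (rule.pattern.test(request)) return rule.kind;
  }
  return null;
}

// Returns null when no rule matches, so the caller can fall back to a default agent.
function routeRequest(request: string): RouteTarget | null {
  const kind = classifyTask(request);
  return kind === null ? null : ROUTES[kind];
}
```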
Local AI in the Browser (WebLLM)
Dume.ai supports local inference using WebLLM, which runs models like these right in your browser:
- phi-2
- mistral-7b
- tinyllama
This means:
- 🔐 Privacy: prompts and data stay on your device (no cloud call)
- ⚡ Low-latency inference for light tasks (no network round trip)
- 🧪 Experimental local agents (offline mode)
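One way to picture how local and cloud inference coexist is a small decision function that keeps light tasks in the browser and sends everything else to a hosted model. This is a sketch under assumptions: the `InferenceRequest` fields and the token cutoff are invented for the example and are not Dume.ai settings.

```typescript
type InferenceBackend = "local" | "cloud";

interface InferenceRequest {
  promptTokens: number;      // rough size of the prompt plus retrieved context
  needsTools: boolean;       // e.g. the task must call the Gmail or Jira APIs
  offline: boolean;          // no network available
  localModelLoaded: boolean; // a browser-side model is already in memory
}

// Hypothetical cutoff for what counts as a "light" task.
const LOCAL_TOKEN_BUDGET = 1500;

function chooseBackend(req: InferenceRequest): InferenceBackend {
  if (req.offline) return "local";    // offline mode: local is the only option
  if (req.needsTools) return "cloud"; // tool calls go through the server-side agents
  if (req.localModelLoaded && req.promptTokens <= LOCAL_TOKEN_BUDGET) return "local";
  return "cloud";
}
```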
Message Pipeline and Execution Flow
When you ask Dume.ai something like:
“Summarize today’s meetings and send a reminder to the team.”
Here’s what happens:
- Intent Classifier → Detects that this is a multi-intent prompt (summarize the meetings, then send a reminder)
- Retriever → Pulls today’s calendar events
- Agent Router → Chooses MeetingSummarizerAgent
- LLM Call → Sends prompt + context to Gemini or Claude
- Tool Use → Triggers email reminder action via Gmail API
- Formatter → Sends back response as Markdown + actions
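The steps above can be sketched as a single function with the external pieces injected as dependencies. Everything here is illustrative: the `PipelineDeps` interface, the regex-based intent check, and the stubbed model call are assumptions, and the agent-routing and LLM steps are collapsed into one call to keep the example short.

```typescript
type Intent = "summarize_meetings" | "send_reminder";

interface Meeting {
  title: string;
  start: string;
}

interface PipelineDeps {
  fetchTodaysMeetings: () => Meeting[];               // retriever
  callModel: (promptText: string) => string;          // LLM call
  sendEmail: (subject: string, body: string) => void; // tool use
}

function classifyIntents(message: string): Intent[] {
  const intents: Intent[] = [];
  if (/summar/i.test(message)) intents.push("summarize_meetings");
  if (/remind/i.test(message)) intents.push("send_reminder");
  return intents;
}

function handleMessage(message: string, deps: PipelineDeps): { markdown: string; actions: string[] } {
  const intents = classifyIntents(message); // 1. intent classifier
  const actions: string[] = [];
  let summary = "";

  if (intents.indexOf("summarize_meetings") !== -1) {
    const meetings = deps.fetchTodaysMeetings(); // 2. retriever
    const context = meetings.map((m) => `- ${m.start} ${m.title}`).join("\n");
    // 3-4. agent routing and the LLM call, collapsed into one stubbed call
    summary = deps.callModel(`Summarize today's meetings:\n${context}`);
  }
  if (intents.indexOf("send_reminder") !== -1) {
    deps.sendEmail("Reminder: today's meetings", summary); // 5. tool use
    actions.push("email_reminder_sent");
  }
  // 6. formatter: Markdown for the chat UI plus a list of actions taken
  return { markdown: `## Today's meetings\n\n${summary}`, actions };
}
```

Injecting the retriever, model, and email sender makes each stage easy to stub in tests and to swap between providers.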
Automation Engine (Agent Loop)
Dume.ai is not just reactive—it can run agent loops like:
- “Plan my week”
- “Organize my documents”
- “Summarize unread emails every morning”
Powered by:
- ✅ Planning Loop (inspired by AutoGPT)
- ✅ Task Breakdown (LLM + Rule-based)
- ✅ Execution via APIs or Scripts
- ✅ Self-monitoring via logs
This is our foundation for autonomous AI agents.
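A planning loop of this kind can be sketched in a few lines: ask a planner for the next step, run the matching tool, log the outcome, and stop when the planner says the goal is met or a step budget runs out. The `LoopDeps` interface, the tool names, and the step limit are assumptions for illustration; in practice the planner would be an LLM call.

```typescript
interface PlanStep {
  tool: string;
  input: string;
}

interface LoopDeps {
  // Returns the next step, or null once the goal is considered done.
  planNext: (goal: string, doneSoFar: string[]) => PlanStep | null;
  tools: Record<string, (input: string) => string>;
}

interface LoopResult {
  completed: boolean;
  log: string[]; // doubles as the self-monitoring trail
}

function runAgentLoop(goal: string, deps: LoopDeps, maxSteps = 5): LoopResult {
  const log: string[] = [];
  for (let i = 0; i < maxSteps; i++) {
    const step = deps.planNext(goal, log);
    if (step === null) return { completed: true, log };

    const tool: ((input: string) => string) | undefined = deps.tools[step.tool];
    if (tool === undefined) {
      log.push(`unknown tool: ${step.tool}`);
      continue;
    }
    try {
      log.push(`${step.tool} -> ${tool(step.input)}`);
    } catch (err) {
      // Record the failure so the planner can react on the next iteration.
      log.push(`${step.tool} failed: ${String(err)}`);
    }
  }
  return { completed: false, log }; // step budget exhausted
}
```

The hard step limit is the important design choice: it keeps a confused planner from looping forever or burning through API quota.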
Privacy & Observability
Dume.ai is designed with enterprise-grade data security:
- PII Redaction filters before embedding
- User-level consent control
- Action audit trails
- Zero retention mode for sensitive orgs
- Real-time dashboards and logs
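As an example of what a PII redaction filter before embedding might look like, here is a deliberately simple regex-based sketch. The two patterns and the placeholder labels are assumptions; real redaction typically needs more patterns and often a dedicated detection model.

```typescript
// Illustrative patterns only; they will miss many real-world formats.
const PII_PATTERNS: Array<{ label: string; pattern: RegExp }> = [
  { label: "EMAIL", pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  { label: "PHONE", pattern: /\+?\d[\d\s().-]{7,}\d/g },
];

// Replace each match with a bracketed label before the text is embedded.
function redactPii(text: string): string {
  let out = text;
  for (const { label, pattern } of PII_PATTERNS) {
    out = out.replace(pattern, `[${label}]`);
  }
  return out;
}
```

Running the filter before embedding means the raw identifiers never reach the vector store, so they cannot be surfaced later by retrieval.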
Full Tech Stack
Layer | Stack |
---|---|
Frontend | Next.js, Tailwind, Vercel |
Backend | Node.js, Fastify, tRPC |
Vector DB | Weaviate, pgvector |
AI Orchestration | LangChain, OpenAI, Claude, Gemini |
Local AI | WebLLM |
Workers | BullMQ, Redis |
DB | PostgreSQL |
Hosting | Vercel (FE), GCP (BE), Railway (DB) |
Why This Matters
Understanding how Dume.ai works shows just how powerful and secure modern AI assistants can be. We don’t just chat with LLMs—we orchestrate your work using:
- Vector databases
- Autonomous agents
- Local and cloud inference
- Multi-tool API integrations
It’s the future of productivity—designed for professionals who want AI that works like a teammate.