How Does Dume.ai Work Under the Hood?

In a world drowning in emails, tasks, meetings, and docs, AI assistants like Dume.ai are becoming essential tools for knowledge workers. But have you ever wondered what’s happening under the hood?

How does Dume.ai extract action items from your inbox, summarize your meetings, and automate workflows across Gmail, Notion, Jira, and more?

In this post, we’ll dive into the technical architecture, the AI pipeline, and the system design decisions powering Dume.ai—giving you a detailed look at how AI assistants work.

What Is Dume.ai?

Before we go deep into internals, here’s a quick overview.

Dume.ai is a personal AI agent that connects to your digital workspaces—email, calendar, docs, and tasks—and lets you chat with your work. It fetches, summarizes, organizes, and automates—all through a conversational interface powered by models like GPT-4, Gemini, and Claude.

High-Level Architecture

Dume.ai follows a modular microservices-based architecture:

Frontend: Next.js (React-based)
Backend: Node.js + TypeScript
AI Workers: Event-driven job queues
Storage: PostgreSQL + Redis
Vector DB: Weaviate or pgvector
AI Stack: OpenAI, Gemini, Claude, LangChain, LlamaIndex
Auth: OAuth 2.0 for secure integration with third-party tools

Here’s what this looks like in layers:

[Figure: architecture diagram]

Data Integration Layer

Dume.ai begins by connecting securely with your tools:

✅ Gmail via Gmail API
✅ Google Calendar for meetings
✅ Notion for notes and tasks
✅ Jira & Confluence via Atlassian APIs

Powered by OAuth 2.0

Users grant access only to specific OAuth scopes for each tool. All fetched content is then normalized into a unified internal data schema, which allows consistent parsing and AI processing regardless of the source.
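To make the idea of a unified schema concrete, here is a minimal TypeScript sketch that maps two source-specific payloads onto one record type. The `WorkItem` shape, the simplified `GmailMessage` and `CalendarEvent` inputs, and the function names are assumptions made for illustration, not Dume.ai's actual schema.

```typescript
// Hypothetical unified record; field names are illustrative only.
type SourceKind = "gmail" | "calendar" | "notion" | "jira";

interface WorkItem {
  id: string; // globally unique: "<source>:<source id>"
  source: SourceKind;
  title: string;
  body: string;
  participants: string[];
  timestamp: string; // ISO 8601, UTC
}

// Simplified stand-ins for what a connector might hand over.
interface GmailMessage {
  id: string;
  subject: string;
  snippet: string;
  from: string;
  to: string[];
  internalDateMs: number;
}

interface CalendarEvent {
  id: string;
  summary: string;
  description?: string;
  attendees: string[];
  startIso: string;
}

function fromGmail(m: GmailMessage): WorkItem {
  return {
    id: `gmail:${m.id}`,
    source: "gmail",
    title: m.subject,
    body: m.snippet,
    participants: [m.from, ...m.to],
    timestamp: new Date(m.internalDateMs).toISOString(),
  };
}

function fromCalendar(e: CalendarEvent): WorkItem {
  return {
    id: `calendar:${e.id}`,
    source: "calendar",
    title: e.summary,
    body: e.description ?? "", // events may have no description
    participants: e.attendees,
    timestamp: new Date(e.startIso).toISOString(),
  };
}
```

Once every connector emits the same record type, downstream steps such as chunking, embedding, and prompting only need to handle one shape.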

Performance Optimizations

Webhooks + polling fallback
Delta sync using sync tokens
Queue-based ingestion system for large orgs
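Delta sync with sync tokens can be sketched roughly as follows. The `fetchPage` callback, the `ChangePage` shape, and the convention that an expired token surfaces as an error with `status === 410` are assumptions for this example (the Google Calendar API, for instance, signals an invalid sync token with HTTP 410); a real connector would wrap the provider's own client.

```typescript
// One page of changes from a source API (hypothetical shape).
interface ChangePage<T> {
  items: T[];
  nextPageToken?: string; // more pages remain in this sync pass
  nextSyncToken?: string; // present on the last page only
}

type FetchPage<T> = (opts: {
  syncToken?: string;
  pageToken?: string;
}) => Promise<ChangePage<T>>;

// Assumed convention: the fetcher throws an error carrying status 410
// when the stored sync token is no longer valid.
function isTokenExpired(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { status?: number }).status === 410
  );
}

async function deltaSync<T>(
  fetchPage: FetchPage<T>,
  storedToken?: string,
): Promise<{ items: T[]; syncToken?: string; fullResync: boolean }> {
  let syncToken = storedToken;
  let fullResync = storedToken === undefined;

  for (;;) {
    try {
      const items: T[] = [];
      let pageToken: string | undefined;
      let newToken: string | undefined;
      do {
        const page = await fetchPage({ syncToken, pageToken });
        items.push(...page.items);
        pageToken = page.nextPageToken;
        newToken = page.nextSyncToken ?? newToken;
      } while (pageToken !== undefined);
      return { items, syncToken: newToken, fullResync };
    } catch (err) {
      // Token expired: drop it and retry once as a full sync.
      if (isTokenExpired(err) && syncToken !== undefined) {
        syncToken = undefined;
        fullResync = true;
        continue;
      }
      throw err;
    }
  }
}
```

The caller persists the returned `syncToken` and passes it back on the next run, so only changes since the previous sync are transferred.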

Embedding & Vector Search

After ingestion, Dume.ai transforms content into embeddings using models like:

text-embedding-3-small (OpenAI)
bge-small-en-v1.5 (BAAI, via Hugging Face)

Before embedding, content is split into chunks using recursive text splitting. The chunks and their vectors are stored in:

Weaviate (primary)
or PostgreSQL with pgvector (fallback)
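For readers unfamiliar with recursive text splitting, the sketch below shows the core idea in plain TypeScript: try the coarsest separator first, merge pieces while they fit, and recurse with finer separators only for pieces that are still too large. It is a simplified illustration (no chunk overlap, character counts instead of tokens), not the splitter Dume.ai or LangChain ships; `splitRecursive` and `SEPARATORS` are names chosen for this example.

```typescript
// Separators from coarsest to finest; "" means "cut at any character".
const SEPARATORS = ["\n\n", "\n", " ", ""];

function splitRecursive(
  text: string,
  chunkSize: number,
  separators: string[] = SEPARATORS,
): string[] {
  if (text.length <= chunkSize) return text.length > 0 ? [text] : [];

  const [sep, ...finer] = separators;
  // No usable separator left: hard-cut into fixed-size slices.
  if (sep === undefined || sep === "") {
    const out: string[] = [];
    for (let i = 0; i < text.length; i += chunkSize) {
      out.push(text.slice(i, i + chunkSize));
    }
    return out;
  }

  const chunks: string[] = [];
  let current = "";
  for (const piece of text.split(sep)) {
    if (piece.length === 0) continue;
    const candidate = current === "" ? piece : current + sep + piece;
    if (candidate.length <= chunkSize) {
      current = candidate; // still fits: keep merging
      continue;
    }
    if (current !== "") chunks.push(current);
    if (piece.length <= chunkSize) {
      current = piece;
    } else {
      // A single piece is too large: recurse with finer separators.
      chunks.push(...splitRecursive(piece, chunkSize, finer));
      current = "";
    }
  }
  if (current !== "") chunks.push(current);
  return chunks;
}
```

For example, `splitRecursive("aaa bbb ccc\n\nddd eee", 7)` keeps the second paragraph whole and splits the first one at a space, because only the first paragraph exceeds the limit.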

This powers Dume.ai’s RAG system—Retrieval-Augmented Generation—for grounding large language models in your actual work data.
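The retrieval half of RAG boils down to a nearest-neighbor search over embeddings. A vector database does this at scale with approximate indexes; the brute-force TypeScript sketch below only illustrates the principle. `StoredChunk`, `topK`, and `buildPrompt` are hypothetical names, and the prompt wording is an assumption.

```typescript
interface StoredChunk {
  id: string;
  text: string;
  embedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks most similar to the query embedding.
function topK(query: number[], chunks: StoredChunk[], k: number): StoredChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((x) => x.chunk);
}

// Ground the model by placing retrieved text ahead of the question.
function buildPrompt(question: string, context: StoredChunk[]): string {
  const sources = context.map((c, i) => `[${i + 1}] ${c.text}`).join("\n");
  return `Answer using only the sources below.\n\n${sources}\n\nQuestion: ${question}`;
}
```

The query embedding must come from the same embedding model as the stored chunks; otherwise the similarity scores are meaningless.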

AI Orchestration with LangChain

We use LangChain to route tasks to specialized agents:

Task	Agent	Model
Email Reply	Email Agent	GPT-4
Digest Summary	Calendar Agent	Gemini Pro
Task Extraction	ActionAgent	Claude 3

These chains handle:

Document retrieval
Prompt templating
Output parsing
Function calling for structured responses

Dume.ai also supports tool use and multi-step reasoning, enabled via RouterChain + ToolCallingRouter.
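The task-to-agent routing described in this section can be pictured as a lookup table plus a classifier. The sketch below is plain TypeScript and does not use LangChain's API; the keyword classifier is a toy stand-in for an LLM-based router, and the `Task` identifiers are assumptions. Only the agent and model names come from the table above.

```typescript
type Task = "email_reply" | "digest_summary" | "task_extraction";

interface Route {
  agent: string;
  model: string;
}

// Mirrors the task/agent/model table in this section.
const ROUTES: Record<Task, Route> = {
  email_reply: { agent: "Email Agent", model: "GPT-4" },
  digest_summary: { agent: "Calendar Agent", model: "Gemini Pro" },
  task_extraction: { agent: "ActionAgent", model: "Claude 3" },
};

// Toy keyword classifier; a production router would likely ask an LLM.
function classifyTask(prompt: string): Task {
  const p = prompt.toLowerCase();
  if (/\b(reply|respond|draft)\b/.test(p)) return "email_reply";
  if (/\b(action items?|todos?|tasks?)\b/.test(p)) return "task_extraction";
  return "digest_summary"; // default when nothing else matches
}

function route(prompt: string): Route & { task: Task } {
  const task = classifyTask(prompt);
  return { task, ...ROUTES[task] };
}
```

Keeping the routing table as data makes it easy to swap the model behind an agent without touching the classifier.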

Local AI in the Browser (WebLLM)

Dume.ai supports local inference using WebLLM, which runs models like the following right in your browser:

phi-2
mistral-7b
tinyllama

This means:

🔐 Privacy: prompts and data stay on your device (no cloud call)
⚡ Low-latency inference for light tasks, with no network round trip
🧪 Experimental local agents (offline mode)
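How a client might decide between local and cloud inference is sketched below. Everything here is an assumption for illustration: the `chooseBackend` function, the token budget, and the rule that tool-using requests go to the cloud are not taken from Dume.ai. The one hard constraint is that WebLLM needs WebGPU support in the browser.

```typescript
type Backend = "local" | "cloud" | "unavailable";

interface InferenceRequest {
  promptTokens: number; // rough size of prompt plus context
  needsTools: boolean; // must call external APIs (e.g. send an email)
}

interface ClientState {
  hasWebGPU: boolean; // e.g. the result of `"gpu" in navigator`
  online: boolean;
  localModelLoaded: boolean;
}

// Assumed budget for a small in-browser model; tune per model.
const LOCAL_TOKEN_BUDGET = 2048;

function chooseBackend(req: InferenceRequest, client: ClientState): Backend {
  const localCapable = client.hasWebGPU && client.localModelLoaded;
  const lightTask = !req.needsTools && req.promptTokens <= LOCAL_TOKEN_BUDGET;

  if (localCapable && lightTask) return "local";
  if (client.online) return "cloud";
  // Offline: fall back to the local model even for heavier prompts,
  // as long as the request does not need external tools.
  return localCapable && !req.needsTools ? "local" : "unavailable";
}
```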

Message Pipeline and Execution Flow

When you ask Dume.ai something like:

“Summarize today’s meetings and send a reminder to the team.”

Here’s what happens:

Intent Classifier → Detects that the prompt contains two intents: summarize meetings and send a reminder
Retriever → Pulls today’s calendar events
Agent Router → Chooses MeetingSummarizerAgent
LLM Call → Sends prompt + context to Gemini or Claude
Tool Use → Triggers email reminder action via Gmail API
Formatter → Sends back response as Markdown + actions
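The six steps above can be sketched as a single function with each stage injected as a dependency. This is an illustrative outline, not Dume.ai's code: `handlePrompt`, the `PipelineDeps` interface, the intent names, and the Markdown layout are all assumptions.

```typescript
type Intent = "summarize_meetings" | "send_reminder";

// Each stage is injected so it can be swapped or stubbed in tests.
interface PipelineDeps {
  classifyIntents(prompt: string): Intent[];
  retrieveEvents(day: string): Promise<string[]>;
  chooseAgent(intents: Intent[]): string;
  callLlm(agent: string, context: string[]): Promise<string>;
  sendReminder(body: string): Promise<boolean>;
}

interface PipelineResult {
  markdown: string;
  actions: string[];
}

async function handlePrompt(
  prompt: string,
  day: string,
  deps: PipelineDeps,
): Promise<PipelineResult> {
  const intents = deps.classifyIntents(prompt); // 1. intent classifier
  const actions: string[] = [];
  let summary = "";

  if (intents.includes("summarize_meetings")) {
    const events = await deps.retrieveEvents(day); // 2. retriever
    const agent = deps.chooseAgent(intents); // 3. agent router
    summary = await deps.callLlm(agent, events); // 4. LLM call
  }

  if (intents.includes("send_reminder")) {
    const ok = await deps.sendReminder(summary); // 5. tool use
    actions.push(ok ? "reminder_sent" : "reminder_failed");
  }

  // 6. formatter: Markdown response plus the list of actions taken
  return { markdown: `## Meetings for ${day}\n\n${summary}`, actions };
}
```

Running the tool step only after the summary exists matters here, because the reminder body depends on the LLM output.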

Automation Engine (Agent Loop)

Dume.ai is not just reactive—it can run agent loops like:

“Plan my week”
“Organize my documents”
“Summarize unread emails every morning”

✅ Planning Loop (inspired by AutoGPT)
✅ Task Breakdown (LLM + Rule-based)
✅ Execution via APIs or Scripts
✅ Self-monitoring via logs
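A stripped-down version of such a loop might look like the sketch below: plan once, execute each step, and record every outcome in a log. It is a plan-then-execute simplification that does not re-plan after each step the way AutoGPT-style loops do. `runAgentLoop`, `AgentDeps`, and the `maxSteps` guard are hypothetical names and choices for this example.

```typescript
interface Step {
  id: number;
  description: string;
}

interface LogEntry {
  stepId: number;
  status: "ok" | "failed";
  detail: string;
}

interface AgentDeps {
  plan(goal: string): Promise<Step[]>; // task breakdown (LLM + rules)
  execute(step: Step): Promise<string>; // API call or script
}

async function runAgentLoop(
  goal: string,
  deps: AgentDeps,
  maxSteps = 10, // hard cap so a bad plan cannot run unbounded
): Promise<LogEntry[]> {
  const log: LogEntry[] = [];
  const steps = (await deps.plan(goal)).slice(0, maxSteps);

  for (const step of steps) {
    try {
      const detail = await deps.execute(step);
      log.push({ stepId: step.id, status: "ok", detail });
    } catch (err) {
      // Record the failure and stop; later steps may depend on this one.
      log.push({ stepId: step.id, status: "failed", detail: String(err) });
      break;
    }
  }
  return log;
}
```

The returned log is what a self-monitoring layer would inspect to decide whether to retry, re-plan, or surface the failure to the user.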

This is our foundation for autonomous AI agents.

Privacy & Observability

Dume.ai is designed with enterprise-grade data security:

PII Redaction filters before embedding
User-level consent control
Action audit trails
Zero retention mode for sensitive orgs
Real-time dashboards and logs
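As an illustration of redacting PII before embedding, the sketch below replaces email addresses and phone-like numbers with placeholders. The regular expressions are deliberately crude assumptions for this example; a production filter would need locale-aware patterns or a dedicated PII detection service.

```typescript
// Simple email pattern; not a full RFC 5322 matcher.
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Crude phone pattern: a digit run of 9+ characters with common separators.
// It also matches other long digit sequences such as ISO dates, so it
// errs on the side of over-redaction.
const PHONE = /\+?\d[\d\s().-]{7,}\d/g;

function redactPii(text: string): string {
  return text.replace(EMAIL, "[EMAIL]").replace(PHONE, "[PHONE]");
}
```

Redacting before embedding means the placeholder, not the raw value, is what ends up in the vector store and in any prompt built from retrieved chunks.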

Full Tech Stack

Layer	Stack
Frontend	Next.js, Tailwind, Vercel
Backend	Node.js, Fastify, tRPC
Vector DB	Weaviate, pgvector
AI Orchestration	LangChain, OpenAI, Claude, Gemini
Local AI	WebLLM
Workers	BullMQ, Redis
DB	PostgreSQL
Hosting	Vercel (FE), GCP (BE), Railway (DB)

Why This Matters

Understanding how Dume.ai works shows just how powerful and secure modern AI assistants can be. We don’t just chat with LLMs—we orchestrate your work using:

Vector databases
Autonomous agents
Local and cloud inference
Multi-tool API integrations

It’s the future of productivity—designed for professionals who want AI that works like a teammate.