Transparency — open technology, no mysteries

How Buddy thinks,
listens, and responds

No black boxes. Here's exactly what happens when you say "Hey Buddy" — from wake word to response.

👁
Background mode
Invisible, always listening for your wake word only
💊
Pill mode
Appears when you speak — thinks, answers, disappears
⚙️
Full app mode
Settings, history, privacy controls — on demand

Five steps, zero guessing

Every request you make follows the same transparent path. Click through each step to see exactly what's happening.

Voice pipeline walkthrough
1
🎤
Wake word
Heard its name
2
📝
Transcription
Speech to text
3
🧬
Speaker check
Is it you?
4
🧠
Routing
Local or cloud?
5
🔊
Response
Buddy speaks
Step 1 — Wake word detected
Buddy runs a tiny, ultra-efficient model called microWakeWord continuously in the background. It uses almost no CPU and listens for just one thing: the phrase "Hey Buddy". Nothing else is recorded or processed until this trigger fires.
microWakeWord · ~1 MB model · runs entirely on-device · <1% CPU
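The background loop amounts to a sliding window of audio frames scored by the tiny model. Here's a minimal sketch of that shape — the window size, the 0.9 threshold, and the `score_window` stand-in for the real microWakeWord model are all illustrative, not Buddy's actual values:

```python
from collections import deque

WINDOW_FRAMES = 49   # roughly one second of 20 ms frames (illustrative)
THRESHOLD = 0.9      # confidence needed before the pipeline wakes up

def wake_word_loop(frames, score_window):
    """Scan an audio stream frame by frame; return the index where the
    wake word fires, or None. Only the sliding window ever exists --
    frames that fall off the back are gone, nothing is stored."""
    window = deque(maxlen=WINDOW_FRAMES)
    for i, frame in enumerate(frames):
        window.append(frame)          # oldest frame drops off automatically
        if len(window) == WINDOW_FRAMES and score_window(list(window)) >= THRESHOLD:
            return i                  # trigger: start capturing from here
    return None                       # stream ends, nothing was kept
```

Note the key property: until `score_window` crosses the threshold, audio exists only inside that fixed-size window and is continuously discarded.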
Step 2 — Voice transcribed
Once the wake word fires, your next spoken sentence is captured and converted to text using Faster-Whisper — a highly optimised version of OpenAI's Whisper model. This transcription runs entirely on your device. Your audio never leaves your machine.
Faster-Whisper (small.en) · INT8 quantised · runs locally · no audio upload
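"Your next spoken sentence is captured" implies an endpointing step: keep recording until you stop talking. A production build would use a proper voice-activity-detection model; this toy sketch uses per-frame energy only, with made-up thresholds, just to show the idea:

```python
def capture_utterance(frames, silence_after=15, energy_floor=0.01):
    """Collect audio frames after the wake word until a run of quiet
    frames signals the sentence is over. `frames` are per-chunk RMS
    energies; both thresholds are illustrative."""
    captured, quiet = [], 0
    for frame in frames:
        captured.append(frame)
        quiet = quiet + 1 if frame < energy_floor else 0
        if quiet >= silence_after:
            return captured[:-silence_after]   # trim the trailing silence
    return captured
```

Only the captured span is handed to Faster-Whisper for transcription; everything after the silence cutoff is never processed.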
Step 3 — Speaker verified
Buddy uses ECAPA-TDNN, a speaker verification model, to compare your voice against your stored voice profile. This prevents anyone else from activating your Buddy — even someone who says "Hey Buddy" in the same room. Your voice profile is stored encrypted on your device only.
ECAPA-TDNN · cosine similarity · voice profile AES-256 encrypted · on-device
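The check itself reduces to one cosine-similarity comparison between two embedding vectors. A self-contained sketch — the embeddings would come from ECAPA-TDNN, and the 0.7 threshold here is illustrative, not Buddy's actual setting:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors: 1.0 = identical
    direction, 0.0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def is_owner(utterance_embedding, profile_embedding, threshold=0.7):
    """Accept the request only if the new utterance matches the stored
    voice profile above the confidence threshold."""
    return cosine_similarity(utterance_embedding, profile_embedding) >= threshold
```

If `is_owner` returns False, the pipeline stops there: the request is rejected before transcription reaches the router.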
Step 4 — Request routed
Phi-3 Mini acts as a smart gatekeeper. It reads your request and instantly decides: can this be answered locally, or does it genuinely need cloud intelligence? Simple requests, personal data queries, and anything privacy-sensitive always stay local. Only complex reasoning tasks are sent to cloud — with your personal details removed first.
Phi-3 Mini 3.8B · 4-bit GGUF · router inference ~60ms · PII scrubbed before cloud
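In Buddy the routing decision is made by Phi-3 Mini itself, reading the request. This keyword sketch is only a stand-in to show the shape of the decision — sensitive or personal topics force the local path, everything else falls through to cloud; the word lists are invented for illustration:

```python
LOCAL_TOPICS = {"calendar", "reminder", "lights", "steps", "file", "files"}
SENSITIVE = {"health", "password", "tax", "document", "documents"}

def route(query: str) -> str:
    """Toy stand-in for the Phi-3 routing decision: personal-data and
    privacy-sensitive requests always stay local; anything that looks
    like open-ended reasoning goes to cloud (after PII scrubbing)."""
    words = set(query.lower().split())
    if words & (LOCAL_TOPICS | SENSITIVE):
        return "local"
    return "cloud"
```

The important property is the default direction of the bias: when a request touches anything personal, the answer is always "local", never "cloud".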
Step 5 — Response spoken
The answer is converted back to natural-sounding speech using Kokoro-82M, a compact text-to-speech model that runs locally. You hear Buddy's voice within milliseconds. The full pipeline — from wake word to spoken answer — typically completes in under two seconds for local requests.
Kokoro-82M TTS · ~82M params · natural prosody · runs on CPU · <200ms latency

What stays on your device,
and what doesn't

Buddy has a strict two-tier system. Click any example query below to see which tier it falls into and why.

Privacy Tier 1 — Local
Stays entirely on your device · never transmitted
  • 📅 "What's on my calendar tomorrow?"
  • 💡 "Turn off the living room lights"
  • ❤️ "How many steps did I walk today?"
  • 📂 "Find my tax documents from 2023"
  • 📌 "Remind me to call Mum at 6pm"
Privacy Tier 2 — Cloud
Complex reasoning · PII removed before sending
  • 🌍 "Explain quantum entanglement simply"
  • 💻 "Help me debug this Python function"
  • ✈️ "Plan a 10-day trip to Japan"
  • ✍️ "Help me write a formal complaint letter"
  • 🔭 "What's the latest on Mars missions?"
Routing decision
← Click any query above to see the routing decision
PII scrubbing — what gets removed before any cloud request
Raw input
Hi, I'm Alex Johnson and I live at 42 Oak Street, Portland. Can you help me write a letter to my landlord David Chen?
Sent to cloud
Hi, I'm [PERSON] and I live at [ADDRESS]. Can you help me write a letter to my landlord [PERSON]?
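Mechanically, scrubbing is a replace-by-label pass over the detected entities. In the real pipeline the (span, label) pairs would come from a local NER model; in this sketch they are passed in by hand so the example above can be reproduced exactly:

```python
def scrub(text, entities):
    """Replace each detected entity span with its type tag before the
    text leaves the device. `entities` is a list of (span, label) pairs,
    here supplied manually instead of by an NER model."""
    for span, label in entities:
        text = text.replace(span, f"[{label}]")
    return text

raw = ("Hi, I'm Alex Johnson and I live at 42 Oak Street, Portland. "
       "Can you help me write a letter to my landlord David Chen?")
clean = scrub(raw, [("Alex Johnson", "PERSON"),
                    ("42 Oak Street, Portland", "ADDRESS"),
                    ("David Chen", "PERSON")])
```

`clean` is the redacted sentence shown above — the only version of the request that ever reaches a cloud provider.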

How Buddy remembers you

Buddy has three distinct layers of memory, each serving a different purpose. All of it is encrypted and lives only on your device.

Short-term memory

A sliding window of your last 5–10 exchanges. This is how Buddy knows what "it" or "that" refers to in the middle of a conversation — no need to repeat yourself.

You What time is sunrise?
Buddy 6:42am tomorrow.
You And sunset?
Buddy 7:18pm. (the recent-exchange window tells Buddy that "And sunset?" continues the sunrise question)
🔒 Session memory · cleared on restart
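A sliding window like this maps naturally onto a fixed-length queue: new exchanges push old ones out, and nothing survives a restart. A minimal sketch (the 10-turn cap matches the 5–10 range above; the class itself is illustrative):

```python
from collections import deque

class ShortTermMemory:
    """Sliding window of recent exchanges. Oldest turns drop off
    automatically; the whole structure lives in RAM and vanishes
    when the session ends."""
    def __init__(self, max_turns=10):
        self.turns = deque(maxlen=max_turns)

    def add(self, user_said, buddy_said):
        self.turns.append((user_said, buddy_said))

    def context(self):
        # Handed to the model alongside the new query, so "it" and
        # "that" can be resolved against recent turns.
        return list(self.turns)
```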
🗄️

Long-term memory

Stored in a LanceDB vector database on your device. When you ask something, Buddy searches this database for relevant past context and surfaces it automatically — this is called RAG (Retrieval-Augmented Generation).

Vector search example
Query: "flight to Tokyo"
✓ "Japan trip planning" (0.94)
✓ "Passport renewal reminder" (0.87)
✗ "Morning coffee order" (0.11)
🔒 AES-256 · stored on device · never synced
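The retrieval half of RAG is a similarity ranking over stored embeddings. LanceDB does this with an indexed vector search; this brute-force sketch shows the idea, with an illustrative 0.5 cutoff for "relevant enough to surface":

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return sum(x * y for x, y in zip(a, b)) / (
        math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def retrieve(query_vec, memories, k=3, min_score=0.5):
    """Rank stored memories against the query embedding and keep only
    confident matches. `memories` is a list of (text, vector) pairs."""
    scored = sorted(((cosine(query_vec, vec), text) for text, vec in memories),
                    reverse=True)
    return [(round(s, 2), t) for s, t in scored[:k] if s >= min_score]
```

This is why "Japan trip planning" surfaces for a Tokyo-flight query while "Morning coffee order" doesn't: the first scores well above the cutoff, the second far below it.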
🧩

Entity memory

Facts Buddy learns about you, stored in a structured profile.json. These let Buddy personalise responses without you having to repeat yourself every time.

diet: Allergic to peanuts
location: Portland, Oregon
preference: Prefers metric units
work_hours: 9am – 6pm weekdays
🔒 AES-256 · editable · delete anytime
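Reading and updating a structured profile like this is a small upsert operation. A sketch under one stated simplification: the real profile.json sits inside Buddy's encrypted data folder, and this version skips the encryption layer entirely:

```python
import json
from pathlib import Path

def remember(profile_path, key, value):
    """Upsert one learned fact into the structured profile and return
    the updated profile. Encryption omitted for clarity -- the real
    file is stored AES-256 encrypted."""
    path = Path(profile_path)
    profile = json.loads(path.read_text()) if path.exists() else {}
    profile[key] = value
    path.write_text(json.dumps(profile, indent=2))
    return profile
```

Because it's a plain key-value file, "editable · delete anytime" falls out for free: removing a fact is deleting a key, and wiping the profile is deleting the file.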

Five promises we keep,
every single time

These aren't marketing claims. They're technical constraints built into how Buddy works.

🎙️
Your voice profile never leaves your device
🔐
Conversation history encrypted with AES-256
🧹
Personal details removed before any cloud request
☁️
Cloud used only when absolutely necessary
🚫
No data sold. No ad tracking. Ever.
What stays local · what (rarely) reaches cloud
🎤
Your voice
Audio input
📱
Your device
Wake · transcribe · verify · route
🧠
Phi-3 Mini
Local AI · most answers
Voice · memory · personal data
☁️
Cloud AI
Complex only · PII stripped

Common questions

Does Buddy work without internet? +
Yes — and that's a feature, not a fallback. The vast majority of everyday requests (smart home control, calendar, reminders, file access, personal queries) are handled entirely by Phi-3 Mini on your device. Internet is only needed for genuinely complex tasks like deep research or advanced coding help.
Can Buddy hear everything I say? +
No. Buddy runs a tiny wake-word model (microWakeWord) that only recognises the phrase "Hey Buddy". No audio is recorded or processed until that specific phrase is detected. Everything before the wake word is discarded immediately — it never touches memory or storage.
Who else can activate my Buddy? +
Only you. After the wake word fires, Buddy runs speaker verification using ECAPA-TDNN — a neural network that compares the speaker's voice against your stored voice profile. If the voice doesn't match yours above a confidence threshold, the request is rejected silently. Even a recording of your voice is designed to fail verification.
Where is my data stored? +
Everything — your voice profile, conversation history, long-term memories, and entity data — is stored in an encrypted folder on your device. On macOS this is in your Application Support directory. On Windows it's in AppData. All data is encrypted with AES-256. There is no cloud backup, no sync, no server-side storage of any kind.
Which cloud providers does Buddy use? +
For complex requests that genuinely need cloud intelligence, Buddy currently supports OpenAI (GPT-4o) and Google Gemini. You can choose your preferred provider in Settings. In all cases, your personal information — names, addresses, health data, file contents — is removed by the PII scrubber before the request leaves your device.
Can I use Buddy completely offline? +
Yes. Buddy has a dedicated offline mode that routes everything to Phi-3 Mini regardless of complexity. You'll get slightly simpler answers to complex questions, but all core functionality — smart home, calendar, reminders, files, personal queries, memory — works perfectly with no internet connection at all.

Ready to meet Buddy?

Join the early access list and be first to try the AI assistant that takes your privacy seriously.