Skip to main content
When a WhatsApp message arrives, Whatsy runs it through a fully local pipeline — no cloud services, no external APIs. The macOS app, the background agent, your conversation history, and the Ollama language model all work together on your Mac to decide whether to reply, what to say, and when to send it.

The reply pipeline

Every incoming message follows the same sequence of steps. The pipeline is designed to be predictable: each stage either passes the message forward or stops it, so you always know what Whatsy will do.
1

Message arrives

WhatsApp delivers a new message to Whatsy’s local bridge. The bridge passes it to the background agent for evaluation.
2

Auto-reply check

The agent checks whether auto-reply is enabled in your settings. If it is disabled, the message is logged as skipped and no further action is taken.
3

Behavior rules are evaluated

Your behavior rules are checked in order. The first rule whose condition matches the message wins. Depending on the rule’s action, the message may be skipped, archived, or routed to a specific persona. If no rule matches, the pipeline continues normally.
4

Spam filtering (if enabled)

When spam filtering is on, the agent checks the message text against a built-in list of spam keywords. Matched messages are skipped and logged.
5

Conversation context is loaded

The agent reads recent messages for this conversation from your local history. This context window (up to your configured buffer size) gives the language model enough background to produce a coherent reply, not just a response to a single sentence.
6

Persona prompt is assembled

Your active persona’s markdown instructions and example conversation pairs are combined with the conversation context to form the prompt that will be sent to Ollama.
7

Ollama generates the reply

The prompt is sent to Ollama running locally on your Mac. The reply is generated by your chosen model (default: llama3.2) and returned to the agent. Your messages never leave your machine.
8

Randomized delay is applied

Before sending, the agent waits a random amount of time between your configured minimum and maximum delay (default: 30–180 seconds). This makes replies feel natural rather than instantaneous.
9

Reply is sent

The generated reply is sent through your WhatsApp account via the local bridge. The reply and all activity details are saved to your local history.
If Ollama returns SKIP as its output — meaning the persona determined no reply is needed — the agent logs the decision and sends nothing. The conversation is not closed; future messages will still be processed normally.

What runs on your Mac

Every component in the Whatsy stack runs locally. There is no hosted backend.

macOS App

The Swift/SwiftUI application that lives in your menu bar. It provides the settings UI, the activity dashboard, and persona management. It communicates with the background agent over a local connection that never leaves your machine.

Background Agent

The long-lived process that handles all message logic: evaluating rules, building prompts, calling Ollama, and sending replies. It is started by the macOS app and keeps running in the background even when the main window is closed.

Ollama

A local LLM runtime that serves the language model on your Mac. It receives the assembled prompt and returns the generated reply text. Your messages never leave your machine.

Local Storage

All conversation history, activity logs, and contact data are stored locally inside ~/Library/Application Support/Whatsy/data/. Nothing is synced to a remote server.

The background agent

When you launch Whatsy, the macOS app starts the background agent automatically. The agent waits for incoming messages from the WhatsApp bridge and processes them according to your settings.
Closing the main Whatsy window does not stop the agent. As long as the app is running in the menu bar, the agent continues to receive and reply to messages. To stop all activity, quit Whatsy from the menu bar icon.
The macOS app and the agent stay in sync through a local connection. When you change a setting — such as toggling auto-reply or swapping the active persona — the app writes the new value to ~/Library/Application Support/Whatsy/data/config.json and the agent reloads its configuration immediately.

The context window

Whatsy does not reply to messages in isolation. Before generating a reply, the agent loads the most recent messages from the conversation — up to the buffer size you set (default: 30 messages). These messages are passed to Ollama as context so the model understands the flow of the conversation. This means replies can naturally reference what was said earlier in the thread, pick up on tone shifts, and avoid repeating information. The context is stored entirely on your Mac and is pruned automatically once the buffer limit is reached, keeping storage usage bounded.