
# How to Connect WhatsApp to Ollama (Local AI)
Run your WhatsApp chatbot entirely on your own machine — no API keys, no cloud costs, no data leaving your network. Ollama makes it easy to run open-source models like Llama, Gemma, and Mistral locally.
## Prerequisites

- A paired whatsmeow-node session (How to Pair)
- Ollama installed and running
- A model pulled: `ollama pull llama3.2`
- The Ollama SDK: `npm install ollama`
## Step 1: Pull a Model

```shell
# Install Ollama from https://ollama.com, then:
ollama pull llama3.2
```
Other good choices for chat:

- `gemma3` — Google's open model, fast and capable
- `mistral` — Strong for its size
- `llama3.2:1b` — Smallest Llama, fastest responses
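If you want to experiment with these models without editing code, the model name can be read from an environment variable instead of being hard-coded. A small sketch — `OLLAMA_BOT_MODEL` is an arbitrary variable name chosen here, not something the SDK reads:

```typescript
// Pick the model from the environment, falling back to llama3.2.
// OLLAMA_BOT_MODEL is a made-up name for this sketch.
const MODEL = process.env.OLLAMA_BOT_MODEL ?? "llama3.2";
console.log(`Using model: ${MODEL}`);
```

Then `OLLAMA_BOT_MODEL=gemma3 node bot.js` switches models without touching the source.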
## Step 2: Set Up Both Clients

```ts
import { createClient } from "@whatsmeow-node/whatsmeow-node";
import { Ollama } from "ollama";

const client = createClient({ store: "session.db" });
const ollama = new Ollama({ host: "http://localhost:11434" });

const MODEL = "llama3.2";
const SYSTEM_PROMPT =
  "You are a helpful WhatsApp assistant. Keep responses concise — under 500 characters when possible, since this is a chat interface.";
```
## Step 3: Handle Incoming Messages

```ts
client.on("message", async ({ info, message }) => {
  if (info.isFromMe) return;

  const text =
    (message.conversation as string) ??
    (message.extendedTextMessage as { text?: string } | undefined)?.text;
  if (!text) return;

  await client.sendChatPresence(info.chat, "composing");

  const reply = await askOllama(info.sender, text);
  await client.sendMessage(info.chat, { conversation: reply });
});
```
## Step 4: Send to Ollama

```ts
async function askOllama(userJid: string, userMessage: string): Promise<string> {
  const response = await ollama.chat({
    model: MODEL,
    messages: [
      { role: "system", content: SYSTEM_PROMPT },
      { role: "user", content: userMessage },
    ],
  });
  return response.message.content;
}
```
## Step 5: Add Conversation History

```ts
import type { Message } from "ollama";

const conversations = new Map<string, Message[]>();
const MAX_HISTORY = 20;

async function askOllama(userJid: string, userMessage: string): Promise<string> {
  const history = conversations.get(userJid) ?? [];
  history.push({ role: "user", content: userMessage });

  // Keep only the most recent turns so the prompt stays bounded.
  if (history.length > MAX_HISTORY) {
    history.splice(0, history.length - MAX_HISTORY);
  }

  const response = await ollama.chat({
    model: MODEL,
    messages: [{ role: "system", content: SYSTEM_PROMPT }, ...history],
  });

  const reply = response.message.content;
  history.push({ role: "assistant", content: reply });
  conversations.set(userJid, history);
  return reply;
}
```
## Complete Example

```ts
import { createClient } from "@whatsmeow-node/whatsmeow-node";
import { Ollama } from "ollama";
import type { Message } from "ollama";

const client = createClient({ store: "session.db" });
const ollama = new Ollama({ host: "http://localhost:11434" });

const MODEL = "llama3.2";
const SYSTEM_PROMPT =
  "You are a helpful WhatsApp assistant. Keep responses concise — under 500 characters when possible, since this is a chat interface.";

const conversations = new Map<string, Message[]>();
const MAX_HISTORY = 20;

async function askOllama(userJid: string, userMessage: string): Promise<string> {
  const history = conversations.get(userJid) ?? [];
  history.push({ role: "user", content: userMessage });
  if (history.length > MAX_HISTORY) {
    history.splice(0, history.length - MAX_HISTORY);
  }

  const response = await ollama.chat({
    model: MODEL,
    messages: [{ role: "system", content: SYSTEM_PROMPT }, ...history],
  });

  const reply = response.message.content;
  history.push({ role: "assistant", content: reply });
  conversations.set(userJid, history);
  return reply;
}

client.on("message", async ({ info, message }) => {
  if (info.isFromMe) return;

  const text =
    (message.conversation as string) ??
    (message.extendedTextMessage as { text?: string } | undefined)?.text;
  if (!text) return;

  console.log(`${info.pushName}: ${text}`);
  await client.sendChatPresence(info.chat, "composing");

  try {
    const reply = await askOllama(info.sender, text);
    await client.sendMessage(info.chat, { conversation: reply });
    console.log(`→ ${reply.slice(0, 80)}...`);
  } catch (err) {
    console.error("Ollama error:", err);
    await client.sendMessage(info.chat, {
      conversation: "Sorry, I'm having trouble right now. Make sure Ollama is running.",
    });
  }
});

client.on("logged_out", ({ reason }) => {
  console.error(`Logged out: ${reason}`);
  client.close();
  process.exit(1);
});

async function main() {
  const { jid } = await client.init();
  if (!jid) {
    console.error("Not paired! See: How to Pair WhatsApp");
    process.exit(1);
  }

  await client.connect();
  await client.sendPresence("available");
  console.log(`Ollama bot is online! (model: ${MODEL})`);

  process.on("SIGINT", async () => {
    await client.sendPresence("unavailable");
    await client.disconnect();
    client.close();
    process.exit(0);
  });
}

main().catch(console.error);
```
## Common Pitfalls
Make sure the Ollama server is running (`ollama serve`) before starting the bot. If it's not running, all requests will fail.
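Rather than discovering the problem on the first incoming message, you can fail fast with an explicit health check at startup. A minimal sketch — `isOllamaUp` is a helper name invented here, and it relies on the Ollama server answering plain HTTP requests on its root URL:

```typescript
// Returns true when an Ollama server answers at the given host.
async function isOllamaUp(host = "http://localhost:11434"): Promise<boolean> {
  try {
    const res = await fetch(host); // a running server responds on its root URL
    return res.ok;
  } catch {
    return false; // connection refused: server not running
  }
}
```

Call it before `client.init()` and exit with a clear error message when it returns `false`, instead of letting every chat request fail later.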
Local models run on your CPU/GPU. Smaller models like `llama3.2:1b` respond in 1–3 seconds on modern hardware. Larger models may take 10+ seconds — the typing indicator keeps the user informed while they wait.
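If you'd rather not keep users waiting indefinitely on a slow model, you can cap the wait. This is a generic promise-timeout wrapper, not part of whatsmeow-node or the Ollama SDK; `withTimeout` is a name made up for this sketch:

```typescript
// Rejects with a timeout error if the wrapped promise takes longer than ms.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`Model did not reply within ${ms}ms`)),
      ms,
    );
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Example: give the model 30 seconds, then let the existing catch block
// send the fallback apology.
// const reply = await withTimeout(askOllama(info.sender, text), 30_000);
```

Note this only stops the bot from waiting; the underlying Ollama request may still run to completion in the background.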
Always check `info.isFromMe` first. Without this check, the bot replies to its own messages forever.
Run `ollama pull llama3.2` before starting the bot. If the model isn't downloaded, requests will fail.
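You can also verify this at startup by listing installed models with the SDK's `list()` call. The name-matching helper below (`hasModel`, invented for this sketch) assumes Ollama reports untagged names like `llama3.2` as `llama3.2:latest`:

```typescript
// True when `wanted` matches an installed model name, treating an
// untagged name like "llama3.2" as "llama3.2:latest".
function hasModel(installed: string[], wanted: string): boolean {
  const withTag = wanted.includes(":") ? wanted : `${wanted}:latest`;
  return installed.some((name) => name === wanted || name === withTag);
}

// Startup check, using the `ollama` client from Step 2
// (requires a running Ollama server):
// const { models } = await ollama.list();
// if (!hasModel(models.map((m) => m.name), MODEL)) {
//   console.error(`Model missing. Run: ollama pull ${MODEL}`);
//   process.exit(1);
// }
```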