I've been building something over the past few months that I think is worth putting down in this journal, because it sits at the intersection of two things I care deeply about — my business and my belief that small business owners should not be at the mercy of software companies that don't understand what we do.
I've been building my own local AI system. A completely self-hosted one. It runs on my MacBook Pro, it doesn't send a single byte of my business data to anyone's cloud, and the total cost to run it is about a hundred and ninety-one dollars a year. Zero token costs. I've taken to calling it my own Jarvis, and I realize that sounds like something straight out of an Iron Man movie, damn corny, I know, but it is the most practical and meaningful thing I have built for my business in a very long time.
Let me explain why I built it.
The software available to most small business owners, especially those of us running specialized operations like senior living communities, is either too generic or too expensive, and usually both. I've tried the big cloud AI tools. They are impressive, and I mean that sincerely. But they are built for everyone, which in practice means they are built for no one in particular. They don't know the difference between a Corrective Action Plan and a medication administration log. They don't know that Golden Pines has specific protocols that differ by facility. They don't know my residents, my team, my regulators, or the day-to-day reality of what it means to operate in senior care in Southeast Michigan.
And so I decided to build my own AI assistant, one that does.
The system I built is what's called a Retrieval-Augmented Generation pipeline, or RAG for short, which is a technical way of saying something that's actually pretty intuitive once you break it down. You take all of your business documents — policies, procedures, training materials, licensing records, corrective action plans, everything — and you break them into smaller pieces, tag them with context, and store them in a way that an AI can search through and actually understand. When someone asks a question, the RAG system finds the most relevant pieces of your own knowledge and uses those to give a real answer, grounded in your actual documents. Not a generic response pulled from the internet. Your knowledge. Your business. I'm simplifying, of course, but that's the gist of it.
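To make that flow concrete, here is a toy sketch of the two halves — ingest and retrieve. The bag-of-words "embedding" below is a deliberately crude stand-in for a real embedding model (my system uses a proper one), and the document text is made up; the point is the shape of the pipeline, not the math.

```python
# Toy RAG flow: chunk documents, "embed" them, retrieve by similarity.
# The Counter-based embedding is a stand-in for a real embedding model.
from collections import Counter
import math

def chunk(text, size=40):
    """Split a document into fixed-size word windows."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text):
    """Stand-in embedding: a word-frequency vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two word-frequency vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, store, k=2):
    """Return the k stored chunks most similar to the question."""
    q = embed(question)
    return sorted(store, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

# Ingest: break example documents into chunks and index them.
docs = [
    "Medication administration logs must be reviewed weekly by the lead nurse.",
    "Fire drills are held quarterly at every facility.",
]
store = [c for d in docs for c in chunk(d)]

# Query: find the chunks most relevant to a question. The answer the
# language model gives is then grounded in these retrieved chunks.
hits = retrieve("Who reviews the medication logs?", store)
```

In the real system, the retrieved chunks are handed to the language model as context, which is what keeps its answers grounded in your documents instead of the open internet.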
The entire thing runs locally on my laptop using Docker and a tool called Colima to manage it, with a vector database called Qdrant that holds all of the embedded knowledge. For the AI that actually reads and analyzes each document, I'm running Mistral-Nemo, a twelve-billion-parameter model, locally through a tool called Ollama. When that model can't handle something, it falls back to Claude. I used Claude Code to help me build almost the entire system, which honestly is a story that deserves its own article someday.
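The fallback arrangement is simple enough to sketch. The two functions below are hypothetical stand-ins — in my setup the local call goes to Mistral-Nemo through Ollama and the fallback goes to Anthropic's API — but the routing logic is the part that matters: free and local first, paid cloud only when the local model genuinely can't cope.

```python
# Local-first routing with a paid cloud fallback. Both "ask" functions
# are stubs standing in for real API calls (Ollama locally, Claude remotely).

def ask_local(prompt):
    """Stub for a call to the local model running under Ollama."""
    if len(prompt) > 2000:  # pretend very long prompts exceed local capacity
        raise RuntimeError("context too large for local model")
    return f"[local] answer to: {prompt[:30]}"

def ask_claude(prompt):
    """Stub for the paid cloud fallback."""
    return f"[claude] answer to: {prompt[:30]}"

def answer(prompt):
    """Try the free local model first; fall back to Claude only on failure."""
    try:
        return ask_local(prompt)
    except Exception:
        return ask_claude(prompt)

print(answer("Summarize the November corrective action plan."))
print(answer("x" * 5000))  # too big for the stubbed local model, so Claude answers
```

This is also where the eleven dollars a year of Claude spend comes from: the fallback path only fires when the local path fails.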
But what surprised me most were the lessons I didn't expect to learn.
Early on, I was actually using Google's Gemini as one of my AI models, and it ran up a dollar and three cents in charges before I even knew it was happening. No warning, no auto-cutoff, nothing. I removed it that same day. That one dollar and three cents taught me more about why running a local AI matters than any article I could have read about it.
I also discovered that Apple's built-in Vision framework, the one that ships free on every Mac, outperformed Tesseract — which is pretty widely considered the standard for optical character recognition — on every single test I ran. Scanned PDFs, image-only documents, you name it. Apple Vision handled all of it, at zero cost. It had been sitting there on my machine the whole time, and I didn't even know.
And then there was the insight that I think changed the entire architecture of the system. When you feed a document chunk to an embedding model — that's the AI that converts your text into searchable vectors — it understands natural language far better than it understands structured data. So instead of tagging a document with something like "document type: Corrective Action Plan, facility: Golden Pines," I prepend a full sentence in plain English: "This is a Corrective Action Plan from Golden Pines on Herbmoor dated November 2025. This section addresses medication administration." The AI understands that. It can actually find that when someone asks a question about medication protocols at a specific facility. That single design choice, injecting human-readable context before embedding, was the difference between a system that sort of worked and one that gives my team real answers.
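That context-injection trick is a few lines of code. The field names below are illustrative rather than my system's actual schema, but the move is exactly as described: build a plain-English sentence from the chunk's metadata and prepend it to the text before it ever reaches the embedding model.

```python
# Prepend human-readable context to a chunk before embedding, instead of
# attaching structured tags the embedding model understands poorly.
# Metadata field names here are illustrative, not an actual schema.

def contextualize(chunk_text, meta):
    """Return the chunk with a plain-English context sentence prepended."""
    preamble = (
        f"This is a {meta['document_type']} from {meta['facility']} "
        f"dated {meta['date']}. This section addresses {meta['topic']}."
    )
    return preamble + "\n" + chunk_text

meta = {
    "document_type": "Corrective Action Plan",
    "facility": "Golden Pines on Herbmoor",
    "date": "November 2025",
    "topic": "medication administration",
}
embeddable = contextualize("Staff must double-check dosages...", meta)
```

The embedding model never sees a bare `facility:` tag; it sees a sentence it can actually relate to a question like "what happened with medication protocols at Golden Pines?"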
As of right now, I have ingested over nine hundred files and stored nearly thirteen thousand searchable chunks in the database, with about twenty-five hundred files still to process. Each chunk carries over forty metadata fields — the facility, the document type, the people involved, the dates, the sensitivity level, the domain it belongs to. The system knows what it is looking at.
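For a sense of what "the system knows what it is looking at" means in practice, here is an illustrative slice of a per-chunk payload. My real chunks carry forty-plus fields and these names and values are examples, not my actual schema; in Qdrant this kind of payload travels alongside the chunk's embedding vector and can be filtered on.

```python
# An example slice of per-chunk metadata (illustrative names and values).
# The real system stores 40+ fields per chunk.

chunk_payload = {
    "facility": "Golden Pines",
    "document_type": "Corrective Action Plan",
    "people": ["(example staff member)"],
    "date_issued": "2025-11-04",
    "sensitivity": "internal",
    "domain": "medication-administration",
}

def matches(payload, **filters):
    """Tiny metadata filter: True only if every filter matches the payload."""
    return all(payload.get(k) == v for k, v in filters.items())
```

Filtering on metadata like this is what lets a question about one facility stay scoped to that facility's documents.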
Granted, I'm still building this out. The retrieval layer that will let my Associates and Partners actually query the RAG system in real time is designed but not yet live. I am learning as I go, and there are days where things break and I have to rethink my approach from scratch. But the foundation is solid, and I can see where this is going.
And here is what I keep coming back to. The total annual cost of this entire local AI system is a hundred and ninety-one dollars. About a hundred and eighty for Voyage AI embeddings and reranking, about eleven dollars for Claude as a fallback, and everything else — the vector database, the local language model, the OCR, all seven orchestration scripts — is zero. All of it on a MacBook Pro. For any small business owner who has ever looked at enterprise AI pricing and thought there has to be a better way, there is.
I think there is a blueprint here for other entrepreneurs, especially those of us running brick-and-mortar operations. The big tech companies would have us believe that AI is something you rent from them, on their terms, at their prices, with your data on their servers. But the tools exist right now, today, to build something entirely your own. Something that knows your business because you taught it your business.
And so I will keep building in this new AI era, even as we continue our core business of taking care of seniors. I truly believe that falling behind on technology would hold back the expansion of our operations. We want to make a difference in so many people's lives, and moving forward that means working at the intersection of senior care and AI. A local RAG system is just one of many projects we are tackling; sometimes the tools of caring for seniors are more technology-forward these days. Tokens will be the new currency of the next decade, and that means running locally can be massive for those of us in senior care trying to put technology and AI to work.

