The AI hype cycle is exhausting, and the real engineering questions are still hard: what should you actually use AI for, where does your data go, how do you make sure it doesn't go off the rails, and what do you do when the vendor doubles their price next year? I help organizations cut through the noise and build AI capability that holds up after the demo.

My approach favors solutions you can own and adapt over black-box dependencies. Open models where they make sense, hosted models where they don't, and clear documentation either way. The goal is durable capability, not vendor lock-in dressed up as innovation.

Engagements range from one-day strategy and policy workshops up through hands-on implementation of retrieval pipelines, local model deployments, and AI-assisted development workflows for engineering teams.

What I'm Seeing in the Field

The questions I'm hearing are pretty consistent across smaller organizations.

Schools want to know whether teachers can use ChatGPT with student work, what the data-handling implications actually are under FERPA, and how to write an AI use policy that holds up in practice. Most are doing this ad hoc and would rather have a thoughtful answer.

Small businesses are looking at things like ticket triage, customer-email summarization, knowledge-base search, and document extraction. They don't need an AI strategy deck. They need to know which two or three projects are actually worth building this quarter.

Non-profits want help thinking through donor data, member outreach, and grant-writing assistance, usually with strong privacy concerns. The honest answer often involves self-hosting, which most consultants won't recommend because it's harder to bill against.

Tools I Reach For

For local model deployment: Ollama, llama.cpp, and vLLM, with various models depending on the workload. For organizations with privacy or cost reasons to keep inference in-house, the open ecosystem is finally good enough to take seriously.
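
As a rough illustration of what in-house inference looks like from application code, here is a minimal sketch that calls a locally running Ollama server over its HTTP API. It assumes Ollama is listening on its default port (11434) and that a model has already been pulled; the model name "llama3" and the helper names are placeholders, not a recommendation.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Build a non-streaming generate request. The model name is a placeholder."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "llama3", timeout: float = 60.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    # With "stream": False, the reply is a single JSON object with a "response" field.
    return body["response"]
```

Nothing in this call leaves the machine, which is the point: the prompt, the documents in it, and the output all stay on hardware you control.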

For retrieval-augmented generation: I prefer keeping the retrieval layer transparent and inspectable rather than burying it in a framework. Postgres with pgvector handles the vast majority of small-to-medium retrieval workloads without standing up a separate vector database. If you're in AWS already, we'll probably talk about Bedrock.
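
To show what "transparent and inspectable" can mean in practice, here is a small pure-Python sketch of the ranking step a vector search performs: cosine distance between a query embedding and each document embedding, closest first. This is the same quantity pgvector's `<=>` operator computes for non-zero vectors. The document IDs and three-dimensional vectors are made up for illustration; real embeddings come from an embedding model and have hundreds of dimensions.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity. Smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float], docs: Dict[str, Sequence[float]], k: int = 3
) -> List[Tuple[str, float]]:
    """Return the k closest documents as (doc_id, distance), closest first."""
    scored = [(doc_id, cosine_distance(query, vec)) for doc_id, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Hypothetical toy embeddings, for illustration only.
DOCS = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.0],
    "privacy-notice": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    for doc_id, distance in top_k([1.0, 0.0, 0.0], DOCS, k=2):
        print(f"{doc_id}\t{distance:.3f}")
```

Because every score is visible, a wrong answer can be traced back to the documents that were retrieved and how close they actually were, which is much harder when retrieval is hidden inside a framework call.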

For agent-shaped work: small, observable, retryable pipelines first. Agent frameworks only after you've proven the work is worth automating and the prompts hold up under load. Anything an agent can touch is part of your attack surface, so least privilege and treating model input as untrusted matter from day one, not after the breach.
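
A minimal sketch of what "small, observable, retryable" can look like for a single pipeline step: each attempt is logged, transient failures are retried with exponential backoff, and the final failure is re-raised rather than swallowed. The function name, log format, and retry defaults are illustrative assumptions, not a prescribed design.

```python
import logging
import time
from typing import Callable, TypeVar

T = TypeVar("T")
log = logging.getLogger("pipeline")


def run_step(
    name: str,
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], object] = time.sleep,
) -> T:
    """Run fn, logging every attempt and retrying failures with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            result = fn()
            log.info("step=%s attempt=%d status=ok", name, attempt)
            return result
        except Exception as exc:
            log.warning("step=%s attempt=%d status=error error=%r", name, attempt, exc)
            if attempt == attempts:
                raise  # out of retries: surface the failure to the caller
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

A step like `run_step("summarize", lambda: call_model(ticket_text))` (where `call_model` is whatever model client you use) leaves a log line per attempt, so you can see how often the work fails before deciding whether it deserves a framework.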

For evaluation: simple harnesses with promptfoo or hand-rolled scoring before anything fancy. You can't improve what you can't measure, and most teams skip this step entirely.
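
For a sense of how little a hand-rolled harness needs, here is a sketch that scores a model against a handful of cases by checking whether expected phrases appear in the output. The `Case` structure, the substring-based scoring rule, and the stand-in `fake_model` are all assumptions for illustration; a real harness would call your actual model and likely use richer checks.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Case:
    prompt: str
    must_contain: List[str]  # phrases a good answer is expected to include


def score_case(output: str, case: Case) -> float:
    """Fraction of expected phrases found in the output (case-insensitive)."""
    if not case.must_contain:
        return 1.0
    text = output.lower()
    hits = sum(1 for phrase in case.must_contain if phrase.lower() in text)
    return hits / len(case.must_contain)


def evaluate(model: Callable[[str], str], cases: List[Case]) -> float:
    """Mean score across all cases, between 0.0 and 1.0."""
    if not cases:
        raise ValueError("need at least one case")
    return sum(score_case(model(c.prompt), c) for c in cases) / len(cases)


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs offline."""
    return "Refunds are accepted within 30 days of purchase."


CASES = [
    Case("What is the refund window?", ["30 days"]),
    Case("Who handles refunds?", ["support team", "refund"]),
]

if __name__ == "__main__":
    print(f"mean score: {evaluate(fake_model, CASES):.2f}")
```

Even a crude number like this lets you compare two prompts or two models on the same cases, which is the measurement most teams never set up.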

What I Won't Help You Do

A short, honest list:

Train your own foundation model. You don't need to, and the kind of organization that would benefit from one almost certainly isn't yours.
Wire your CRM directly into a public LLM. There's a way to do this safely. The obvious way isn't it.
Stand up an agent framework before you've written a working prompt by hand. Agents don't fix bad prompts; they multiply them.
Replace humans where humans are doing important work. Augment them, instrument them, give them better tools, fine. Replace them, usually no.

How I Can Help

📋

Policy

AI use policies, acceptable-use guidelines, data-handling rules, and the governance scaffolding boards and auditors are starting to ask for

🧭

Organization Adoption

Rollout planning, training, change management, and the unglamorous work of getting AI tools into the hands of people who'll actually use them well

🏠

Local AI

On-prem and self-hosted model deployments when public APIs aren't an option, whether for privacy, compliance, or cost

🔎

RAG Deployment

Retrieval-augmented generation systems that ground answers in your own documents, knowledge bases, and structured data, with the ingestion and evaluation plumbing to keep them honest

⚙️

Processing Pipelines

Batch and streaming AI pipelines for classification, extraction, summarization, and enrichment, designed to be observable, retryable, and replaceable when better models land

✨

AI-Assisted Development (Vibe Coding)

Tooling, guardrails, and team practices for using Claude Code, Cursor, and similar agents to ship real software, not just impressive demos

AI Consulting