Private AI systems / local inference / retrieval architecture

JAG AI

I design and build AI systems that can survive contact with real work: private retrieval, local inference, agent memory, automation, and interfaces people can actually trust.

The focus is practical architecture, not demos for their own sake: traceable answers, measurable retrieval quality, controlled data boundaries, and workflows that reduce friction instead of adding another dashboard to babysit.

What I Build

Agentic workflows with explicit tools, guardrails, memory, and failure handling so automation can take on real operational tasks without becoming opaque.
Hybrid RAG systems that combine semantic search, exact matching, reranking, metadata, and source traceability so answers can be inspected instead of merely trusted.
Evaluation loops that make regressions visible: representative test sets, retrieval comparisons, answer-grounding checks, and clear "what changed?" debugging.
Local-first and privacy-forward systems for data that should not casually leave the machine, network, or organization that owns it.
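
To make the hybrid retrieval idea concrete, here is a minimal sketch of reciprocal rank fusion (RRF), one common way to merge a semantic ranking with an exact-match ranking. It is an illustration only, not necessarily how any JAG AI system scores results; the function name, the document IDs, and the constant k=60 are assumptions chosen for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in.
    k=60 is a commonly used default, assumed here for illustration.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical rankings from a semantic search and an exact-match search.
semantic = ["doc_a", "doc_b", "doc_c"]
exact = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([semantic, exact])
print(fused)
```

Documents that rank well in both lists rise to the top, and because the fusion uses ranks rather than raw scores, the two retrievers do not need comparable scoring scales.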

Featured AI Work

Project S

A private document-intelligence pipeline built around Qdrant, custom conversion, visual extraction, local models, hybrid retrieval, and reranking. The point is not just to find chunks, but to preserve enough context that the answer can be defended.

Read Case Study

LeXpand

A local-first text expansion workflow for people who repeat the same careful language all day: support replies, technical notes, status updates, templates, and everyday browser writing.

View Landing Page

Architecture Notes

A reusable structure for explaining AI projects in a way that stays honest: the problem, constraints, design decisions, retrieval quality, evaluation strategy, tradeoffs, and the evidence still needed.

Case Study Template

More projects: Projects Page

AI Architecture Stack

Vector search: Qdrant, dense embeddings, sparse retrieval, hybrid scoring, and metadata filters tuned for the shape of the corpus.
Ingestion: automatic document conversion with metadata preservation, chunking strategy, visual context extraction, and repeatable rebuilds.
Models: open-source LLMs and VLMs running locally when privacy, cost control, or offline capability matters more than a hosted default.
Ranking: reranking and evaluation passes aimed at precision, recall, answer grounding, and the practical question of whether users get the right source at the right time.
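
As a small illustration of the precision and recall measurements mentioned above, the following sketch computes precision@k and recall@k for a single query. The function name, the document IDs, and the labeled relevant set are hypothetical; a real evaluation loop would average these over a representative test set.

```python
def precision_recall_at_k(retrieved, relevant, k):
    """Return (precision@k, recall@k) for one query.

    retrieved: ranked list of document IDs returned by the system.
    relevant:  set of document IDs judged relevant for the query.
    """
    top_k = retrieved[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant)
    # Precision divides by k, so a short result list is penalized.
    precision = hits / k if k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ranked results and relevance labels for one test query.
retrieved = ["doc_b", "doc_a", "doc_d", "doc_c"]
relevant = {"doc_a", "doc_c"}

precision, recall = precision_recall_at_k(retrieved, relevant, k=3)
print(f"precision@3={precision:.2f} recall@3={recall:.2f}")
```

Running the same queries before and after a change to chunking, embeddings, or reranking gives a direct answer to whether the right sources still surface near the top.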

Work With JAG AI

For teams that need AI work to be more than a prototype: private retrieval, automation design, local model workflows, support tooling, and systems that can be explained after they ship.

Contact · About Alex