Private AI systems / local inference / retrieval architecture

JAG AI

I design and build AI systems that can survive contact with real work: private retrieval, local inference, agent memory, automation, and interfaces people can actually trust.

The focus is practical architecture, not demos for demo's sake: traceable answers, measurable retrieval quality, controlled data boundaries, and workflows that reduce friction instead of adding another dashboard to babysit.

~/portfolio/what-i-build.log ONLINE

What I Build

  • Agentic workflows with explicit tools, guardrails, memory, and failure handling so automation can take on real operational tasks without becoming opaque.
  • Hybrid RAG systems that combine semantic search, exact matching, reranking, metadata, and source traceability so answers can be inspected instead of merely trusted.
  • Evaluation loops that make regressions visible: representative test sets, retrieval comparisons, answer-grounding checks, and clear "what changed?" debugging.
  • Local-first and privacy-forward systems for data that should not casually leave the machine, network, or organization that owns it.
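
As a rough illustration of the evaluation-loop idea above, here is a minimal sketch that compares retrieval recall before and after a pipeline change. The test set, document IDs, and the `recall_at_k` and `mean_recall` helpers are all hypothetical, invented for this example; they do not describe any specific project on this page.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant doc IDs that appear in the top-k retrieved results."""
    if not relevant:
        return 0.0
    top_k = set(retrieved[:k])
    return len(top_k & set(relevant)) / len(relevant)


def mean_recall(run, test_set, k=3):
    """Average recall@k over every query in the test set."""
    total = sum(recall_at_k(run[q], rel, k) for q, rel in test_set.items())
    return total / len(test_set)


# Hypothetical test set: query ID -> doc IDs a correct answer should cite.
test_set = {"q1": ["d1", "d2"], "q2": ["d7"]}

# Hypothetical retrieval output before and after a pipeline change.
baseline = {"q1": ["d1", "d9", "d2"], "q2": ["d7", "d3", "d4"]}
candidate = {"q1": ["d1", "d9", "d8"], "q2": ["d3", "d4", "d7"]}

before = mean_recall(baseline, test_set)
after = mean_recall(candidate, test_set)
print(f"recall@3 before={before:.2f} after={after:.2f}")
if after < before:
    print("regression: candidate retrieves fewer expected sources")
```

Running a check like this on every change is one way to answer "what changed?" with a number instead of a hunch.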

~/portfolio/featured-work.json 3_ENTRIES

Featured AI Work

Project S

A private document-intelligence pipeline built around vector database infrastructure, custom conversion, visual extraction, local models, hybrid retrieval, and reranking. The point is not just to find chunks, but to preserve enough context that the answer can be defended.

Read Case Study →

LeXpand

A local-first text expansion workflow for people who repeat the same careful language all day: support replies, technical notes, status updates, templates, and everyday browser writing.

View Landing Page →

cloudflared-rdp-ssh-locally-managed

A guide-driven project for locally managed Cloudflare Tunnel configuration and simpler RDP/SSH access, focused on repeatable setup, clear security boundaries, and local ownership.

GitHub Repository →

More projects: Projects Page →

~/portfolio/architecture-stack.db COMPILING

AI Architecture Stack

  • Vector search infrastructure: vector databases, dense embeddings, sparse retrieval, hybrid scoring, and metadata filters tuned for the shape of the corpus.
  • Ingestion: automatic document conversion with metadata preservation, chunking strategy, visual context extraction, and repeatable rebuilds.
  • Models: open-source LLMs and VLMs running locally when privacy, cost control, or offline capability matters more than a hosted default.
  • Ranking: reranking and evaluation passes aimed at precision, recall, answer grounding, and the practical question of whether users get the right source at the right time.
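
To make "hybrid scoring" concrete, here is a minimal sketch of reciprocal rank fusion (RRF), one common way to merge a dense-embedding ranking with a sparse keyword ranking. This page does not say which fusion method is used in practice, so treat RRF, the constant `k=60`, and the document IDs below as assumptions for illustration only.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one list, best first.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by several retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties on doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from a dense (embedding) and a sparse (keyword) retriever.
dense = ["doc_a", "doc_b", "doc_c"]
sparse = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([dense, sparse])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

RRF uses only rank positions, so it sidesteps the problem of dense and sparse scores living on different scales. A reranker would then typically reorder the top of the fused list.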

~/portfolio/contact.sh LISTENING

Work With JAG AI

For teams that need AI work to be more than a prototype: private retrieval, automation design, local model workflows, support tooling, and systems that can be explained after they ship.

Get In Touch      About Alex