I spent three months evaluating self-hosted LLM options for a healthcare client generating 6,000 clinical notes per day. Here's what I learned about Qwen, MLX, RunPod, and why the 'just use the API' crowd is often wrong.
Retrieval quality is the bottleneck nobody talks about. Chunking strategies, embedding model choice, and the hybrid search trick that changed everything for our pipeline.
A decision framework for health tech CTOs evaluating whether to build internal AI capabilities or integrate third-party solutions. Spoiler: it depends on your data moat.
We tested four prompt architectures for generating clinical documentation. The winner was first-person voice with a self-audit JSON block — here's the structure and why it works.
Model Context Protocol (MCP) is powerful, but the ecosystem is young. How I'm using MCP for HubSpot, Google Workspace, and custom data sources, and what broke along the way.
Most teams reaching for autonomous agents should be using structured function calling instead. It's more predictable, cheaper, and easier to debug. Here's when each approach wins.