LLM-on-Spark: Four Patterns That Actually Scale
"Just call the LLM in a loop." 9.6 years later, you finish. Here are the 4 patterns that actually scale to a billion rows: Spark UDFs, Ray+vLLM, warehouse-native SQL, or the Batch API. Code + costs.
AI already knows more than you ever will. That’s not the advantage anymore. Your edge is simple: ask better questions, get better answers.
We went from 4K-token context windows to virtual memory filesystems in four years. Here's the engineering story of how LLM memory evolved, and what you should actually use today.
I run a 19-node LangGraph pipeline serving 20,000+ users. I've never written a PyTorch training loop for it. Here's what actually matters, and a 24-week roadmap built around it.
Tools gave agents hands. MCP standardized the wiring. CLIs were there all along. But none of them taught agents how to think about a task. The missing layer turned out to be a markdown file.