Akshay's Expedition Logs
Deep dives into distributed systems, ML infrastructure, and the architectures that make AI at scale actually work
Recent Posts

Distributed Systems
•2026-03-15
Production Ray: Patterns, Mistakes, and Lessons
The final part of our Ray deep dive: five production patterns that work, five mistakes that hurt, and a debugging playbook for distributed ML systems.

GenAI
•2026-03-08
Running AI Image Generation Locally on Apple Silicon
I ran three AI image models locally on a Mac Mini M4 Pro. Real benchmarks, real failures, and a routing guide for when to use Klein, Dev, or Qwen.

Distributed Systems
•2026-03-06
When Nodes Die: Ray's Fault Tolerance
How Ray detects failures, retries tasks, recovers actors, and reconstructs lost objects, plus production patterns for building resilient distributed ML pipelines.
