Technical writing
Technical articles on AI engineering
Code-first technical articles where the implementation, assumptions, metrics, and limitations stay visible. Conceptual tutorials will appear here as they are published.
Latest
What Actually Speeds Up Transformer Inference?
Profiling and optimizing a small autoregressive transformer in JAX using KV caching, batching, graph compilation, and low-bit inference.
JAX, Inference, KV cache, Profiling