Vik's Newsletter
Inside SambaNova's Inference Architecture
How the Three-Tier Memory Hierarchy and the Reconfigurable Dataflow Unit Compare to Rubin, Groq, and Cerebras
May 20 • Vikram Sekar
Cerebras IPO and the Four Bottlenecks in Its Custom-Everything Architecture
A closer look at the wafer-scale architecture, the OpenAI deal, and four design vectors that decide if Cerebras can deploy at scale
May 12 • Vikram Sekar
Tokenmaxxing and the Token Value Chain
Who profits from tokenmaxxing and who gets stuck with the bill.
Apr 14 • Vikram Sekar
TurboQuant: Inner Workings and Implications
Google's year-old research resurfaced in a recent blog post and spooked memory investors.
Mar 26 • Vikram Sekar
Beyond GTC: A Deep Dive into Compute, LPX, and the Untold Story of SpecDec
Analyzing the CPU, GPU, and LPU chip ratios unveiled at the Nvidia GTC keynote, the impact of the Groq LPX chip on disaggregated decoding, and its…
Mar 18 • Vikram Sekar
GTC 2026 Preview | Implications of Nvidia's SRAM-Decode Hardware on the Inference Market
The case for dedicated decode hardware and what it means for AMD, HBM, and the SRAM startup market.
Mar 4 • Vikram Sekar
The AI Datacenter CPU Yellow Pages
Grace, Vera, Venice, Turin, Diamond/Granite Rapids, Clearwater/Sierra Forest, Graviton, Cobalt, Phoenix, AmpereOne.
Feb 24 • Vikram Sekar
The CPU Bottleneck in Agentic AI and Why Server CPUs Matter More Than Ever
How agent orchestration shifts the CPU-GPU ratio, a framework for scoring server CPUs across reasoning and action workloads, applied to 17 datacenter…
Feb 17 • Vikram Sekar
How d-Matrix's In-Memory Compute Tackles AI Inference Economics
A deep look into the architecture from chip construction to rack-scale deployments, performance metrics, and end applications.
Dec 9, 2025 • Vikram Sekar