Discussion about this post

Ravi Srinivasan:

Very nice deep dive. A couple of high-level questions: this article makes the case that in agentic workflows the CPU is the bottleneck relative to the GPU, whereas in inference workflows it may not be. Q1) This 2026 GTC and the Vera Rubin architecture seemed to be all about gearing up for inference; Jensen repeated this message many times. Is the Vera Rubin architecture equally optimized for agentic workflows? If yes, how does Vera compare to Blackwell for agentic workflows? Q2) Jensen made a good case for why having dedicated racks for CPU, GPU, LPU, memory, etc. in Vera speeds up inference. But based on this article, the limitation for agentic workflows may be CPU-GPU communication, in which case wouldn't it be better to have both in the same rack (i.e., Blackwell) rather than in different racks (i.e., Vera)?

