Today's Top News
Nvidia integrates Groq's LPUs into LPX racks for AI inference
Nvidia is using Groq's language processing units (LPUs) to accelerate inference performance in its Vera Rubin rack systems. The LPUs will handle low-latency token generation, while Nvidia's GPUs take on compute-intensive prompt processing. This division of labor enables Nvidia to serve trillion-parameter large language models at high token rates.