THE TECHNOLOGY INTERVIEW
AWS’ Trainium3 chip, a custom-designed accelerator for AI workloads
If models become twice as efficient, that effectively doubles available capacity without adding more power. Bet too heavily on efficiency gains and AWS turns customers away. Bet too conservatively and the company wastes money on unused capacity.
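The trade-off can be put into rough numbers. The sketch below is illustrative only: the function names, the demand figure and the efficiency multipliers are hypothetical and do not come from AWS. It assumes a planner builds just enough capacity to meet expected demand at an assumed efficiency gain, then compares that with what happens at the gain that actually materialises.

```python
def plan_build(expected_demand: float, assumed_gain: float) -> float:
    """Capacity to build (in today's workload units) if models are
    assumed to become `assumed_gain` times more efficient."""
    return expected_demand / assumed_gain


def outcome(expected_demand: float, assumed_gain: float, actual_gain: float) -> dict:
    """Compare planned capacity with the efficiency gain that really arrives."""
    built = plan_build(expected_demand, assumed_gain)
    # An efficiency gain multiplies how much work the same hardware can serve.
    effective = built * actual_gain
    return {
        "built": built,
        "effective": effective,
        "unmet": max(0.0, expected_demand - effective),  # customers turned away
        "idle": max(0.0, effective - expected_demand),   # capacity paid for but unused
    }


# Hypothetical scenarios: demand of 100 units.
too_optimistic = outcome(100.0, assumed_gain=2.0, actual_gain=1.0)
too_cautious = outcome(100.0, assumed_gain=1.0, actual_gain=2.0)
```

In this toy model, betting on a doubling that never arrives leaves half the demand unserved, while ignoring a doubling that does arrive leaves half the effective capacity idle.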
Graviton5 and watching customers at scale
SAP saw 60% performance gains migrating from Graviton4 to Graviton5. The improvement came from two architectural changes that AWS only understood by deploying the previous generation at scale and watching how customers used it.
Graviton4 achieved 192 cores using two processors connected by an interconnect. That design worked for many workloads but introduced latency penalties when processes needed memory from the opposite side of the CPU. Graviton5 consolidates those cores into a single die.
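A simple weighted average shows why cross-die memory access matters. This is a minimal sketch with made-up numbers: the 100 ns local and 150 ns remote latencies and the share of remote accesses are assumptions for illustration, not measurements of Graviton4 or Graviton5.

```python
def average_latency_ns(local_ns: float, remote_ns: float, remote_fraction: float) -> float:
    """Mean memory latency when a share of accesses must cross the interconnect."""
    if not 0.0 <= remote_fraction <= 1.0:
        raise ValueError("remote_fraction must be between 0 and 1")
    return (1.0 - remote_fraction) * local_ns + remote_fraction * remote_ns


# Hypothetical figures: half of all accesses land on the other processor.
two_processors = average_latency_ns(local_ns=100.0, remote_ns=150.0, remote_fraction=0.5)
# With every core on one die there is no remote side in this model.
single_die = average_latency_ns(local_ns=100.0, remote_ns=150.0, remote_fraction=0.0)
```

Under these assumed values the two-processor layout averages 125 ns per access against 100 ns on a single die. The real penalty depends on how well a workload keeps its data close to the cores that use it.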
“We really liked the 192 cores in Graviton4. We didn’t like the fact that it was on two separate dies,” David says. “It was great for databases and analytics, but it would be amazing if we could bring it together.”
The cache needed rethinking too. When AWS looked at workload patterns, the data showed that 192 cores were starving for L3 cache. Graviton5 increases L3 cache by 3x, giving each core access to 2.6x more cache than Graviton4 provided. “When you increase the core count in CPUs, you don’t have enough cache for the cores,” David explains.
technologymagazine.com