The importance of Node Interleaving on AMD compute nodes
Enabling Node Interleaving in the BIOS can greatly increase the performance of a compute node. With node interleaving enabled, the hardware spreads memory addresses across the memory attached to all NUMA nodes, so data placement is handled for you. With it disabled, the user must explicitly control where in memory data is placed so that the CPU working on it gets the best performance.
An explanation of Node Interleaving can be found here
The end result is a 4-5x increase in memory bandwidth.
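A quick way to see which mode a node is in is to count the NUMA nodes the operating system can see. The following is a minimal sketch, assuming a Linux system that exposes its NUMA topology under `/sys/devices/system/node`; the function name `visible_numa_nodes` is made up for illustration. With node interleaving enabled the firmware typically presents memory as a single node, while with it disabled a 4-socket machine like the ones described below would be expected to show one entry per NUMA node.

```python
import os
import re


def visible_numa_nodes(sysfs_root="/sys/devices/system/node"):
    """Return the sorted NUMA node IDs that Linux exposes under sysfs_root.

    Assumption: the kernel lists each NUMA node as a directory named
    node0, node1, ... inside sysfs_root (the usual Linux sysfs layout).
    """
    if not os.path.isdir(sysfs_root):
        return []  # not Linux, or the kernel exposes no NUMA information
    ids = []
    for name in os.listdir(sysfs_root):
        match = re.fullmatch(r"node(\d+)", name)
        if match:
            ids.append(int(match.group(1)))
    return sorted(ids)


if __name__ == "__main__":
    nodes = visible_numa_nodes()
    print(f"NUMA nodes visible to the OS: {len(nodes)} {nodes}")
```

Running this on two supposedly identical machines and getting different node counts is a strong hint that their BIOS memory settings differ.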
In our lab we have several 64-core AMD nodes with the following specs:
- Supermicro H8QGL-6F/H8QGL-iF
- Supermicro 1042-LTF SuperServer
Processor | AMD 6274 |
---|---|
Nickname | Interlagos |
Clock (GHz) | 2.2 |
Sockets/Node | 4 |
Cores/Socket | 16 |
NUMA/Socket | 2 |
DP GFlops/Socket | 140.8 |
Memory/Socket | 32 GB |
Bandwidth/Socket | 102.4 GB/s |
DDR3 | 1333 MHz |
L1 cache (excl.) | 16KB |
L2 cache/# cores | 2MB/2 |
L3 cache/# cores | 8MB/8 |
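The DP GFlops/Socket figure in the table can be reproduced from the core count and clock speed. This is only a sketch of the arithmetic: the value of 4 double-precision floating-point operations per cycle per core is my assumption (it is not stated in the table), chosen because it makes 16 × 2.2 × 4 come out to the listed 140.8.

```python
def peak_dp_gflops(cores, clock_ghz, flops_per_cycle_per_core):
    """Theoretical peak = cores x clock (GHz) x FLOPs per cycle per core."""
    return cores * clock_ghz * flops_per_cycle_per_core


CORES_PER_SOCKET = 16   # from the table
CLOCK_GHZ = 2.2         # from the table
SOCKETS_PER_NODE = 4    # from the table
FLOPS_PER_CYCLE = 4     # assumption, not taken from the table

per_socket = peak_dp_gflops(CORES_PER_SOCKET, CLOCK_GHZ, FLOPS_PER_CYCLE)
per_node = per_socket * SOCKETS_PER_NODE

print(f"{per_socket:.1f} GFlops/socket, {per_node:.1f} GFlops/node")
```

None of these inputs depend on the memory configuration, so this theoretical peak is the same whether node interleaving is on or off.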
A few days ago I noticed that one of the nodes was performing horribly compared to another, so I decided to do some digging. I installed the AMD APP SDK on both machines and ran the clpeak benchmark, with the following results:
Bad Compute Node:
Good Compute Node:
There is a 4-5x difference in memory bandwidth! I omitted the FLOP rates of both nodes because they were identical. Enabling Node Interleaving increases memory bandwidth dramatically.
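If clpeak is not available, a rough sanity check of memory bandwidth can be done with a few lines of Python. This is a minimal sketch, not a replacement for clpeak: it times a single-threaded copy of a large buffer on the host, so its absolute numbers are not comparable to clpeak's and it will not necessarily reproduce the 4-5x gap. The function name `copy_bandwidth_gbs` and the buffer size are arbitrary choices for illustration.

```python
import time


def copy_bandwidth_gbs(size_bytes=64 * 1024 * 1024, repeats=5):
    """Estimate single-threaded memory copy bandwidth in GB/s.

    Copies a buffer of size_bytes several times and keeps the fastest run.
    Each copy reads size_bytes and writes size_bytes, hence the factor of 2.
    """
    src = bytearray(size_bytes)
    dst = bytearray(size_bytes)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst[:] = src  # bulk copy of the whole buffer
        best = min(best, time.perf_counter() - start)
    best = max(best, 1e-9)  # guard against a zero reading from the timer
    return 2 * size_bytes / best / 1e9


if __name__ == "__main__":
    print(f"Approximate copy bandwidth: {copy_bandwidth_gbs():.2f} GB/s")
```

Running the same script on both nodes gives a relative comparison only; a large difference between otherwise identical machines is a reason to go and compare their BIOS settings.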
BIOS Configuration
Note that I will be talking about BIOS version 2.0 here.
Below is the BIOS configuration of the faster machine for the CPU and memory options.
BIOS -> Advanced -> Processor & Clock Options
BIOS -> Advanced -> Advanced Chipset Control -> NorthBridge Configuration
BIOS -> Advanced -> Advanced Chipset Control -> NorthBridge Configuration -> Memory Configuration