In Direct3D 11, all work submission is done via the immediate context, which represents a single stream of commands that go to the GPU. NVIDIA's DX11 driver uses a software scheduler to take a game's single-threaded DX11 draw calls and split them across multiple cores. This incurs a higher CPU overhead across all cores but often improves performance, because the game no longer runs into a single-threaded bottleneck. AMD's GCN architecture uses a hardware scheduler and cannot take a game's single-threaded DX11 draw calls and split them across multiple cores.

The result is that in games that place most of their game logic and draw calls on a single thread, AMD performance will suffer while NVIDIA performance will not. However, in games that are multithreaded so that draw calls are dedicated to one core while game logic is spread across the other cores, AMD can pull ahead of NVIDIA on similar-level GPUs, such as the RX 480 versus the GTX 1060, because NVIDIA's software scheduler incurs CPU overhead across multiple cores to split the draw calls.

Hardware instruction scheduling allows the processor to schedule instructions in the most efficient manner in real time as conditions permit, as opposed to strictly following the order of the code itself regardless of the code's efficiency. This in turn improves the performance of the processor. However, based on their own internal research and simulations, NVIDIA found in their search for efficiency that hardware scheduling was consuming a fair bit of power and area for few benefits. In particular, since Kepler's math pipeline has a fixed latency, hardware scheduling of the instructions inside a warp was redundant, since the compiler already knew the latency of each math instruction it issued. So NVIDIA has replaced Fermi's complex scheduler with a far simpler scheduler that still uses scoreboarding and other methods for inter-warp scheduling, but moves the scheduling of instructions within a warp into NVIDIA's compiler. In essence it's a return to static scheduling.

The end result is an interesting one, if only because by conventional standards it's going in reverse. Traditionally, processors have started with static scheduling and then moved to hardware scheduling as both software and hardware complexity has increased. With GK104, NVIDIA is going back to static scheduling.
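To make the fixed-latency argument concrete, here is a minimal sketch of what "the compiler already knew the latency" buys you. It is not NVIDIA's compiler or Kepler's real instruction encoding. The `Instr` type, the `static_schedule` function, and the 4-cycle latencies are all hypothetical, chosen only to show that issue cycles can be worked out ahead of time when every latency is known.

```python
from dataclasses import dataclass, field


@dataclass
class Instr:
    name: str
    latency: int  # fixed and known at compile time (illustrative value)
    deps: list = field(default_factory=list)  # names of producer instructions


def static_schedule(program):
    """Compute an issue cycle for each instruction, in program order.

    Assumes one instruction issues per cycle and every latency is fixed,
    so all stalls can be computed here instead of by hardware at run time.
    """
    ready_at = {}  # name -> cycle at which its result becomes available
    issue_at = {}  # name -> cycle at which it is issued
    cycle = 0
    for ins in program:
        # Wait until every producer has finished (0 if there are none).
        earliest = max((ready_at[d] for d in ins.deps), default=0)
        cycle = max(cycle, earliest)
        issue_at[ins.name] = cycle
        ready_at[ins.name] = cycle + ins.latency
        cycle += 1  # the next instruction can issue no earlier than this
    return issue_at


if __name__ == "__main__":
    program = [
        Instr("a", 4),
        Instr("b", 4),
        Instr("c", 4, ["a"]),       # needs a
        Instr("d", 4, ["c", "b"]),  # needs c and b
    ]
    for name, cycle in static_schedule(program).items():
        print(f"{name}: issue at cycle {cycle}")
```

Because each result's ready time is a pure function of the instruction order and the fixed latencies, none of this needs to be rediscovered by dependency-checking hardware while the program runs. That is the redundancy the paragraph above describes.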
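The trade-off between splitting draw calls in the driver and leaving them on one thread can be illustrated with a toy model. This is only a sketch under stated assumptions: the `frame_cpu_ms` function, the 50% splitting overhead, the core count, and every millisecond figure are made up for illustration and are not measurements of any real driver or GPU.

```python
def frame_cpu_ms(logic_ms, draw_ms, *, game_multithreaded,
                 driver_splits_draws, cores=4, split_overhead=0.5):
    """Rough CPU-side time per frame on the critical path (toy model).

    Assumptions (all hypothetical):
      * splitting draw calls inflates total draw work by `split_overhead`
        but spreads it evenly over `cores` cores;
      * a multithreaded game spreads logic over `cores - 1` cores and
        keeps draw submission on one dedicated core (needs cores >= 2);
      * when the driver splits draws in a multithreaded game, its worker
        threads compete with the logic threads for the same cores.
    """
    draw_total = draw_ms * (1.0 + split_overhead) if driver_splits_draws else draw_ms

    if not game_multithreaded:
        # Logic and draw submission share a single game thread.
        if driver_splits_draws:
            return logic_ms + draw_total / cores
        return logic_ms + draw_total

    if driver_splits_draws:
        # All work, including the splitting overhead, shares every core.
        return (logic_ms + draw_total) / cores
    # Dedicated draw core runs alongside the logic cores.
    return max(draw_total, logic_ms / (cores - 1))


if __name__ == "__main__":
    for multithreaded, logic, draw in [(False, 6.0, 6.0), (True, 12.0, 4.0)]:
        for splits in (False, True):
            t = frame_cpu_ms(logic, draw,
                             game_multithreaded=multithreaded,
                             driver_splits_draws=splits)
            print(f"multithreaded={multithreaded} splits={splits}: {t:.2f} ms")
```

With these made-up inputs, the splitting driver comes out ahead for the single-threaded game and behind for the multithreaded one, mirroring the qualitative claim in the text. Different overhead or core-count assumptions would shift where the crossover falls.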