As seen at GDC 2024: Advanced Graphics Summit

If you were in the audience at our Advanced Graphics Summit session on work graphs at GDC 2024, you will have already watched us announce it – draw calls as part of work graphs. If you couldn’t make it, or just want to hear more – you’re in the right place.

For many years, game developers have dreamt of a fully GPU-driven renderer where the whole scene processing takes place on the GPU. Today, some game engines have made impressive strides in moving scene processing to the GPU, but they’re still hampered by a programming model that prevents them from reaching full GPU-driven rendering nirvana.

For example, they must deal with issues like “empty draw compaction” which take up significant optimization time and ultimately limit performance. With the introduction of work graphs, a good chunk of that problem was solved, allowing complex scene processing and traversal to be handled by work graphs, but the draws were still separate. This resulted in difficulties getting the draw calls hooked up nicely into a work graph system, more so as work graphs allow the creation of many small draw calls and frequent PSO changes.
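
To make the “empty draw compaction” problem concrete, here is a minimal CPU-side sketch in Python. It is a toy model, not Direct3D 12 code: `DrawArgs`, `cull`, and `compact` are hypothetical names. It assumes that a culling pass writes one fixed slot per object into an argument buffer and zeroes the slots of culled objects, so a second pass has to squeeze the empty slots out before the draws are issued.

```python
from dataclasses import dataclass


@dataclass
class DrawArgs:
    """Hypothetical stand-in for one indirect draw argument slot."""
    object_id: int
    index_count: int  # 0 means the draw is empty


def cull(objects, is_visible):
    """Write one slot per object; culled objects leave an empty draw behind."""
    return [
        DrawArgs(obj_id, index_count if is_visible(obj_id) else 0)
        for obj_id, index_count in objects
    ]


def compact(arg_buffer):
    """Extra pass: drop empty draws so only real work is submitted."""
    return [args for args in arg_buffer if args.index_count > 0]


# Four objects as (id, index count); this example assumes odd ids are off screen.
objects = [(0, 36), (1, 300), (2, 12), (3, 96)]
arg_buffer = cull(objects, lambda obj_id: obj_id % 2 == 0)
draws = compact(arg_buffer)

print(len(arg_buffer), "slots written,", len(draws), "draws submitted")
```

The `compact` step is the part that costs optimization time in a real renderer, where it has to run on the GPU between culling and drawing.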

Today, we’re very pleased to announce that “mesh nodes” are coming to work graphs later this year!

Work Graphs is the result of several years of collaboration across Microsoft®, AMD, and other partners. We’ve always known that we would want to extend this capability beyond pure compute to also encompass draw nodes. We are delighted to see this prototype already running on real hardware, and look forward to continuing our strong partnership as we add this functionality to a future version of Direct3D®.

“Mesh nodes” extend work graphs by introducing a new kind of leaf node that drives a mesh shader, and which allows a normal graphics PSO to be referenced from the work graph. And yes, you did read this right – full PSO changing can now be done as well! The feature is called mesh nodes as it allows a work graph to feed directly into a mesh shader, turning the work graph itself into an amplification shader on steroids.
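
The idea of a graph whose leaves are draws can be sketched with a small CPU-side toy model in Python. This is not the Direct3D 12 API or HLSL, and every name in it (`run_graph`, the node names, the `pso` strings) is hypothetical. It only illustrates the data flow described above: compute nodes consume a record and emit records for downstream nodes, while a mesh leaf node turns each incoming record into a draw using its own pipeline state. A real work graph runs its nodes in parallel on the GPU; the sequential queue here is a simplification.

```python
from collections import deque


def run_graph(nodes, entry, root_record):
    """Process records one by one until no work is left; return the draws."""
    draws = []
    pending = deque([(entry, root_record)])
    while pending:
        name, record = pending.popleft()
        node = nodes[name]
        if node["kind"] == "mesh":
            # Leaf node: each record becomes a draw with this node's own PSO.
            draws.append((node["pso"], record))
        else:
            # Compute node: emits (target node, record) pairs for later nodes.
            pending.extend(node["run"](record))
    return draws


def scene_node(scene):
    """Emit one record per visible chunk; culled chunks emit nothing."""
    return [("chunk", chunk) for chunk in scene["chunks"] if chunk["visible"]]


def chunk_node(chunk):
    """Route each item to the mesh node that matches its material."""
    return [(item["material"], item["name"]) for item in chunk["items"]]


nodes = {
    "scene": {"kind": "compute", "run": scene_node},
    "chunk": {"kind": "compute", "run": chunk_node},
    "grass": {"kind": "mesh", "pso": "grass_pso"},
    "rock": {"kind": "mesh", "pso": "rock_pso"},
}

scene = {
    "chunks": [
        {"visible": True, "items": [
            {"material": "grass", "name": "tuft_a"},
            {"material": "rock", "name": "boulder"},
        ]},
        {"visible": False, "items": [{"material": "grass", "name": "tuft_b"}]},
    ]
}

draws = run_graph(nodes, "scene", scene)
print(draws)
```

In this model, the culled chunk never produces a record, so no empty draw exists to be removed later. Switching pipeline state is just a matter of which leaf node a record is sent to.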

Note: if you’re not yet sure whether mesh shaders are your thing and want to learn more about them, we highly recommend taking a look at our mesh shader blog series here on GPUOpen.

With this new addition, draw calls become an integral part of the work graph and can be processed while the rest of the graph is executing. We’ve been using them to great effect in our procedural enrichment demo, which we developed together with our partners at Coburg University. In this demo, everything but the skybox and UI is rendered using a single work graph dispatch.

For us, the GPU work graphs API is a major step in graphics programming, especially with the new draw nodes. We wouldn't want to build anything complex without it anymore! We’re looking forward to applying work graphs to many problems in the graphics space.

Some stats from our demo:

  • 6600 draw calls/frame (after coalescing)
  • 13 million triangles/frame
  • 200,000 work items passing through the graph
  • 37 nodes and 9 draw nodes
  • < 200 MiB of work graph backing store memory

“Mesh nodes” really close the loop in terms of providing an end-to-end replacement for ExecuteIndirect and moving the GPU programming model forward. Everything can move into a single graph and execute in a single dispatch, making it very easy to compose large applications from small bits and pieces. Moreover, problems like PSO switching, empty dispatches, and buffer memory management just go away, making fully GPU-driven pipelines accessible to many more applications and use cases than before.

In our demo application, we also saw a significant performance uplift. When tested in the AMD procedural content demo, ExecuteIndirect is 1.64x slower[1] than Work Graphs on average using the mesh nodes extension.

Work graphs vs ExecuteIndirect: Super early numbers. Chart showing ExecuteIndirect at 1.64x compared to Work graphs at 1x. Lower is better.

We’re incredibly excited to see what you’ll be able to do with mesh nodes once they become available later this year!

Make sure you check out our related links below for more work graphs information here on GPUOpen, including our blog series and samples to help get you started.

  1. Testing by AMD as of March 15, 2024, on the AMD Radeon RX 7900 XTX using AMD Software: Adrenalin Edition 31.0.24014.1002 pre-release driver, using the ExecuteIndirect command and Work Graphs with the mesh nodes extension to dispatch scene information to Microsoft® DirectX® 12, on a test system configured with an AMD Ryzen™ 7 5800X CPU, 32GB DDR4 RAM, Gigabyte X570 AORUS ELITE WIFI motherboard, and Windows 11 Pro 2023 Update, using the AMD procedural content Work Graphs demo with the overview, meadow, bridge, wall, and market scene views. System manufacturers may vary configurations, yielding different results. RS-640.