The Work Graph Playground is a DirectX12-based C++ application that allows graphics programmers to learn and experiment with the new Work Graphs feature using HLSL shaders. It runs on Windows 10 and Windows 11 systems.
Work Graphs is a graphics API feature released in March 2024 for DirectX12. In a nutshell, with Work Graphs, shaders can dynamically schedule new workloads at runtime directly from the GPU. Prior to Work Graphs, all GPU workloads had to be scheduled by the CPU. Scheduling on the GPU means that Work Graphs can reduce memory requirements, improve caching behavior, better utilize compute resources, reduce communication between CPU and GPU, and simplify synchronization.
From a programmer's perspective, Work Graphs extend the host side of the API, but most of the time, programmers deal with the shader code introduced with Work Graphs.
We provide several tutorials to walk you through the HLSL-usage of Work Graphs. As most of the power of Work Graphs is unleashed through HLSL, our tutorials focus on the HLSL aspects of Work Graphs. Our Work Graph Playground frees you from dealing with the host-side of Work Graphs. However, you can use our host-side code as a reference for integrating Work Graphs in your own projects.
In each tutorial, we cover one aspect of Work Graphs. The tutorials build upon each other, so we recommend taking them one-by-one. You need just a few prerequisites to run the tutorials.
By the end of the tutorials, if not sooner, you should be inspired to create your own Work Graphs samples and grow from there! If you want to experiment with Work Graphs, check out Adding new tutorials below on how to add your own samples or tutorials to the playground application. Be sure to check out the additional resources and samples linked below if you wish to learn more about Work Graphs.
To take these tutorials, you should know HLSL, C++, and Direct3D12, and have a basic understanding of how GPU compute shaders work.
Besides a computer, you need:
- A text/code editor of your choice
- A Windows version that supports the Microsoft Agility SDK
- Optional: Graphics diagnostic tools for debugging.
To run the sample directly, you'll also need a GPU and driver with D3D12 Work Graphs 1.0 support. You can learn more about D3D12 Work Graphs 1.0 and driver availability on Microsoft's blog post or on our own blog post on GPUOpen.com.
If your GPU does not support Work Graphs, you can download and install the DirectX WARP adapter as follows:
- Download the Microsoft.Direct3D.WARP 1.0.13 NuGet package.
BY INSTALLING THE THIRD PARTY SOFTWARE, YOU AGREE TO BE BOUND BY THE LICENSE AGREEMENT(S) APPLICABLE TO SUCH SOFTWARE and you agree to carefully review and abide by the terms and conditions of all license(s) that govern such software.
- Change the file extension to `.zip`, such that the full filename is `microsoft.direct3d.warp.1.0.13.zip`
- Open or extract the zip file, locate `d3d10warp.dll` in `build/native/bin/x64`, and copy it next to `WorkGraphPlayground.exe`
If you're building the application from source, the steps above are automated by the CMake build script.
Download the latest release from here or follow the instructions for building the application from source.
Start `WorkGraphPlayground.exe` either directly or from the command line.
You can pass the following options to `WorkGraphPlayground.exe`:
- `--forceWarpAdapter` uses the WARP adapter, even if your GPU does support Work Graphs. If you're using pre-built binaries, you'll need to download and install the WARP adapter first. See instructions above.
- `--enableDebugLayer` enables the D3D12 Debug Layer (recommended).
- `--enableGpuValidationLayer` turns on D3D12 GPU validation.
You should see the following application window:
The tutorials consist only of HLSL shader code and are located in the `tutorials` folder.
The shader files are automatically reloaded at runtime whenever changes are detected, meaning you don't have to restart the application after modifying the shader source code.
You will see this in action in the first tutorial.
If the shader compilation fails, the previously (successfully) compiled shader code is used. Any error messages or other output from the shader compiler are displayed in the application output log.
We recommend running the app with the `--enableDebugLayer` command line argument to also see any further error messages from the D3D12 debug layer. Note that Graphics diagnostic tools must be installed in order to enable the debug layer.
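The fallback behavior described above (keep using the last good shader when a reload fails) can be sketched in a few lines of C++. This is only an illustration of the pattern, not the playground's actual code: `ShaderSlot`, `CompiledShader`, `CompileResult`, and `ToyCompile` are hypothetical names, and the toy compiler simply rejects any source that contains the word "error".

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for a compiled shader library (not the playground's real type).
struct CompiledShader {
    std::string bytecode;
};

// Result of one compile attempt: a shader on success, plus the compiler log.
struct CompileResult {
    std::optional<CompiledShader> shader;
    std::string log;
};

// Keeps the last successfully compiled shader active across failed reloads.
class ShaderSlot {
public:
    using Compiler = std::function<CompileResult(const std::string&)>;

    explicit ShaderSlot(Compiler compiler) : compiler_(std::move(compiler)) {
        assert(compiler_ != nullptr);  // a compiler callback is required
    }

    // Recompiles `source`. On failure, the previous shader stays active and
    // only the log is updated. Returns true if the new shader was accepted.
    bool Reload(const std::string& source) {
        CompileResult result = compiler_(source);
        lastLog_ = result.log;
        if (!result.shader) {
            return false;
        }
        active_ = std::move(result.shader);
        return true;
    }

    const std::optional<CompiledShader>& Active() const { return active_; }
    const std::string& LastLog() const { return lastLog_; }

private:
    Compiler compiler_;
    std::optional<CompiledShader> active_;
    std::string lastLog_;
};

// Toy compiler for this sketch only: any source containing "error" fails.
CompileResult ToyCompile(const std::string& source) {
    if (source.find("error") != std::string::npos) {
        return {std::nullopt, "compilation failed"};
    }
    return {CompiledShader{"compiled:" + source}, ""};
}
```

With this structure, a failed `Reload` leaves `Active()` untouched, so rendering can continue with the old shader while `LastLog()` carries the message for the output log.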
Description: This is a minimal introductory tutorial. You get acquainted with our tutorial playground and two very simple Work Graphs nodes. Your task is to launch one worker node from the entry node and print your name from the worker node. You can directly edit the shader file in an editor of your choice. Upon saving, the playground automatically reloads and recompiles the changes you made in the shader file.
Learning Outcome: You get a feeling for our playground application and shader hot reloading, and confirm that our tutorials run on your device. You learn how to
- mark HLSL functions as Work Graphs nodes with `Shader("node")`,
- add further attributes to shader functions, and
- get a first glimpse of how to invoke other nodes
References to Specification:
- Shader Target
- Shader Function Attributes
- See `EmptyNodeOutput` in Node output declaration
Description: Work Graphs use records to model data-flow. Records serve as inputs and outputs of nodes. In this tutorial, you emit records at a producer node and receive them at different consumer nodes. Your producer node invokes multiple consumer nodes that render different things. You parameterize the nodes with records.
Learning Outcome: Besides getting a better understanding of how `EmptyNodeOutput` works, you learn how to
- declare non-empty records with `NodeOutput` at a producer node;
- use the `MaxRecords` attribute to cap the number of record outputs at compile time;
- identify situations when to use `GroupNodeOutputRecords` and `ThreadNodeOutputRecords` to output records;
- output records during runtime with `GroupIncrementOutputCount`, `ThreadIncrementOutputCount`, and `OutputComplete`;
- obtain zero or one output record with `GetThreadNodeOutputRecords` (per thread) or `GetGroupNodeOutputRecords` (per thread-group);
- obtain records at a receiving node with `ThreadNodeInputRecord`; and
- read and write data to records with the `Get()`-method or the `[]`-operator.
References to Specification:
- Objects
- Node output declaration
- Node input declaration
- Methods operating on node output
- Record access
Description: You can launch Work Graphs nodes in three different ways: "thread", "coalescing", and "broadcast". From an entry node, you will launch three different nodes, each using a different launch mode.
Learning Outcome: You will learn
- how and when to use `NodeMaxDispatchGrid` and how it differs from `NodeDispatchGrid`;
- how to use `DispatchNodeInputRecord` as input for nodes launched with `broadcast`;
- how to access individual threads with `SV_DispatchThreadID` for nodes launched in `broadcast`; and
- how to obtain input for nodes launched in `coalescing` launch mode through `GroupNodeInputRecords`.
References to Specification:
Description: The "Classify-and-Execute" pattern is commonly found in graphics. With "Classify-and-Execute", you first determine the class of a work item, and depending on the classification result, you execute different shaders. In this tutorial, we use a basic shading example: as a work item, we use a pixel that covers a ray-surface intersection. First, a Work Graphs node "classifies" the work items, i.e., the pixels, into three different material classes. Then, we "execute" one of three Work Graphs nodes, depending on the classification result. We use the Work Graphs concept of Node Arrays to elegantly solve this "Classify-and-Execute" problem.
Learning Outcome: You will learn how to
- use `NodeOutputArray` to declare that output records are in fact node arrays;
- set the maximum number of output classes that `NodeOutputArray` emits with `NodeArraySize`;
- obtain the classified record `ThreadNodeOutputRecords` with `GetThreadNodeOutputRecords`;
- prepare different nodes and their HLSL functions for use with Node Arrays by extending `NodeId` with an index; and
- properly use `ThreadNodeInputRecord` for Node Arrays to obtain the input.
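To make the pattern concrete before touching HLSL, here is a CPU-side C++ sketch of "Classify-and-Execute". It is not Work Graphs code: an array of queues stands in for a node array, and indexing that array with the classification result stands in for selecting an entry of a `NodeOutputArray`. The material names, the `PixelRecord` fields, the thresholds, and the returned colors are all made up for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical material classes; the tutorial also uses three classes,
// but these particular names are assumptions of the sketch.
enum class Material : std::size_t { Diffuse = 0, Mirror = 1, Emissive = 2 };
constexpr std::size_t kMaterialCount = 3;

// The record that travels from the classifier to the selected node.
struct PixelRecord {
    int x;
    int y;
    float roughness;  // hypothetical surface property used for classification
    float emission;   // hypothetical emitted intensity
};

// "Classify": map one work item to a class index.
Material Classify(const PixelRecord& r) {
    if (r.emission > 0.0f) return Material::Emissive;
    if (r.roughness < 0.1f) return Material::Mirror;
    return Material::Diffuse;
}

// CPU stand-in for a node array: one record queue per array index.
using NodeArrayQueues = std::array<std::vector<PixelRecord>, kMaterialCount>;

// Routes every record to the queue selected by its class.
NodeArrayQueues ClassifyAll(const std::vector<PixelRecord>& pixels) {
    NodeArrayQueues queues;
    for (const PixelRecord& p : pixels) {
        const auto index = static_cast<std::size_t>(Classify(p));
        assert(index < kMaterialCount);
        queues[index].push_back(p);
    }
    return queues;
}

// "Execute": one shading function per class, stored in the same order as the queues.
using ShadeFn = int (*)(const PixelRecord&);
int ShadeDiffuse(const PixelRecord&) { return 0x808080; }
int ShadeMirror(const PixelRecord&) { return 0xFFFFFF; }
int ShadeEmissive(const PixelRecord&) { return 0xFFCC00; }
constexpr std::array<ShadeFn, kMaterialCount> kShaders = {ShadeDiffuse, ShadeMirror, ShadeEmissive};

// Runs each queue through its matching shader; colors are returned in queue order.
std::vector<int> ExecuteAll(const NodeArrayQueues& queues) {
    std::vector<int> colors;
    for (std::size_t i = 0; i < kMaterialCount; ++i) {
        for (const PixelRecord& p : queues[i]) {
            colors.push_back(kShaders[i](p));
        }
    }
    return colors;
}
```

The design point carried over to Work Graphs is that the classifier never branches into shading code itself; it only picks an index, and the work for each class runs in its own node.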
References to Specification:
- Output record objects
- Objects
- Node input declaration
- Node output declaration
- `GetThreadNodeOutputRecords`
- `NodeId`
Description: Nodes can issue records not only for other nodes, but also for themselves. This is called Work Graph Recursion. It supports trivial cycles, i.e., node A can issue work for node A again. One limitation, however, is that a node A cannot issue work to nodes from which A has already received records, even transitively. In other words, non-trivial cycles are disallowed. The recursion depth is also limited. Still, fractals are a great example to try out trivial cycles for self-recursion! You see how Work Graphs compute a simple fractal, the Koch Snowflake. You then get to compute a second fractal, the Menger Sponge.
Learning Outcome: You learn how to implement trivial cycles. In particular, you learn how to
- configure the maximum recursion depth for trivial cycles using `NodeMaxRecursionDepth`;
- use `GetRemainingRecursionLevels` to terminate recursion; and
- recursively call the calling node.
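The termination logic of such a self-recursive node can be sketched on the CPU in C++. This is an analogy rather than Work Graphs code: a plain queue replaces the GPU scheduler, and the remaining recursion levels are carried explicitly in a hypothetical `SegmentRecord`, whereas in HLSL they are queried with `GetRemainingRecursionLevels`. The cap `kSketchMaxLevels` is an arbitrary value chosen for the sketch, not a limit taken from the specification.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Arbitrary cap for this sketch only; it merely keeps the queue small.
constexpr unsigned kSketchMaxLevels = 10;

// One line segment of the Koch curve plus the number of levels it may still recurse.
struct SegmentRecord {
    float x0, y0, x1, y1;
    unsigned remainingLevels;
};

// Processes segments the way a self-recursive node would: a segment with no
// remaining levels is "drawn" (counted here), any other segment emits four
// shorter segments to the same node.
std::size_t CountKochSegments(SegmentRecord root) {
    assert(root.remainingLevels <= kSketchMaxLevels);
    std::deque<SegmentRecord> queue{root};
    std::size_t drawn = 0;
    while (!queue.empty()) {
        const SegmentRecord s = queue.front();
        queue.pop_front();
        if (s.remainingLevels == 0) {
            ++drawn;  // recursion ends here
            continue;
        }
        // Split the segment into thirds.
        const float dx = (s.x1 - s.x0) / 3.0f;
        const float dy = (s.y1 - s.y0) / 3.0f;
        const float ax = s.x0 + dx;
        const float ay = s.y0 + dy;
        const float bx = s.x0 + 2.0f * dx;
        const float by = s.y0 + 2.0f * dy;
        // Apex of the bump: rotate the middle third by 60 degrees around (ax, ay).
        const float cos60 = 0.5f;
        const float sin60 = 0.8660254f;
        const float px = ax + cos60 * dx - sin60 * dy;
        const float py = ay + sin60 * dx + cos60 * dy;
        // Emit four records to "ourselves" with one level less.
        const unsigned next = s.remainingLevels - 1;
        queue.push_back({s.x0, s.y0, ax, ay, next});
        queue.push_back({ax, ay, px, py, next});
        queue.push_back({px, py, bx, by, next});
        queue.push_back({bx, by, s.x1, s.y1, next});
    }
    return drawn;
}
```

Because every recursion step replaces one segment with four, the amount of work grows by a factor of four per level, which is one practical reason to keep the recursion depth small.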
References to Specification:
Description: When you want different thread-groups of a broadcasting node to communicate with each other, Work Graphs offer the input record type `RWDispatchNodeInputRecord`, which is also writable.
Previously, we've seen `DispatchNodeInputRecord`, which is read-only.
We'll use an `RWDispatchNodeInputRecord` to store the bounding box of an object that many threads within a broadcasting node compute cooperatively.
One thread then gets to draw the bounding box.
Learning Outcome: You learn
- to use `RWDispatchNodeInputRecord` such that you can read and write input records;
- how to, when to, and why to apply the `globallycoherent` attribute on `RWDispatchNodeInputRecord` records;
- to read and write `RWDispatchNodeInputRecord`s with atomic operations, using `InterlockedMin` and `InterlockedMax` as examples;
- to prepare record-structs with the `NodeTrackRWInputSharing`-attribute for the usage of `FinishedCrossGroupSharing`; and
- how to synchronize the input record across all the thread-groups of a broadcast launch with `Barrier` and `FinishedCrossGroupSharing`.
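The cooperation scheme can be illustrated with a small CPU-side C++ sketch. It is an analogy under several assumptions, not Work Graphs code: `BoundsRecord` plays the role of the shared writable input record, `AtomicMin` and `AtomicMax` are compare-exchange counterparts of `InterlockedMin` and `InterlockedMax`, and the explicit `groupsRemaining` counter is a stand-in for the tracking that lets exactly one group learn that it finished last. For simplicity, the "groups" are meant to be called one after another; the atomic operations are what would keep the result correct if they ran concurrently.

```cpp
#include <atomic>
#include <cassert>
#include <climits>
#include <vector>

// Shared, writable "input record": every group updates the same bounds.
struct BoundsRecord {
    std::atomic<int> minX{INT_MAX};
    std::atomic<int> minY{INT_MAX};
    std::atomic<int> maxX{INT_MIN};
    std::atomic<int> maxY{INT_MIN};
    // Hypothetical counter; set to the number of groups before processing starts.
    std::atomic<unsigned> groupsRemaining{0};
};

// Lowers `target` to `value` if `value` is smaller. On a failed exchange,
// `current` is refreshed and the comparison is repeated.
void AtomicMin(std::atomic<int>& target, int value) {
    int current = target.load();
    while (value < current && !target.compare_exchange_weak(current, value)) {
    }
}

// Raises `target` to `value` if `value` is larger.
void AtomicMax(std::atomic<int>& target, int value) {
    int current = target.load();
    while (value > current && !target.compare_exchange_weak(current, value)) {
    }
}

struct Point {
    int x;
    int y;
};

// Work of one group: merge its points into the shared bounds, then report
// whether this group was the last one to finish.
bool ProcessGroup(BoundsRecord& record, const std::vector<Point>& points) {
    assert(record.groupsRemaining.load() > 0);
    for (const Point& p : points) {
        AtomicMin(record.minX, p.x);
        AtomicMin(record.minY, p.y);
        AtomicMax(record.maxX, p.x);
        AtomicMax(record.maxY, p.y);
    }
    // fetch_sub returns the previous value, so exactly one caller observes 1.
    return record.groupsRemaining.fetch_sub(1) == 1;
}
```

Only the group for which `ProcessGroup` returns true may safely read the final bounds and draw the box, since all other groups have already contributed their points by then.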
References to Specification:
- Shaders can use input record lifetime for scoped scratch storage
- Input Record objects
- Node input declaration
- Node input attributes
- Record struct
- `FinishedCrossGroupSharing`
- Objects
- Barrier
Description: Another pattern commonly found in computer graphics is recursive subdivision of a geometric primitive. Among the countless examples, we picked computing a Mandelbrot set. We provide the algorithmic part of that in the header file `Mandlebrot.h` and a "brute-force" solution. However, a simple property of Mandelbrot sets gives rise to a subdivision algorithm that optimizes the computation of the Mandelbrot set. Use Work Graphs to exploit this property and make the algorithm more efficient.
Learning Outcome: In this final tutorial, we would like to see you try out your Work Graphs expertise. Your learning outcome should be that you're able to solve a common graphics problem with Work Graphs. From there on, you should be able to assess which tasks Work Graphs are a good fit for.
References to Specification:
- [D3D12 Work Graphs](https://microsoft.github.io/DirectX-Specs/d3d/WorkGraphs.html)
To add a new tutorial, create a new folder inside the `tutorials` folder.
The position of your tutorial in the tutorial list in the application UI will be based on this folder name.
Inside this folder, create a new `.hlsl` file. The filename in camel-case will be used as a name for the tutorial (e.g., `MyNewTutorial.hlsl` will result in `My New Tutorial`).
If you wish to provide a sample solution, create a second `.hlsl` file with the suffix `Solution` (e.g., `MyNewTutorialSolution.hlsl`).
If your tutorial requires more than one file, you can always create additional `.h` header files and include them in your tutorial files.
Restart the application to see your tutorial appear in the tutorial list.
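The naming convention above can be illustrated with a short C++ sketch. It shows one plausible way to derive the display name and to detect solution files; the function names `TutorialNameFromFile` and `IsSolutionFile` are hypothetical, and the playground's own implementation may differ (for example in how it treats consecutive capital letters).

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Turns a camel-case filename such as "MyNewTutorial.hlsl" into a display
// name by dropping the extension and inserting a space before every
// uppercase letter except the first character.
std::string TutorialNameFromFile(const std::string& filename) {
    assert(!filename.empty());
    // substr(0, npos) keeps the whole string when there is no extension.
    const std::string stem = filename.substr(0, filename.rfind('.'));
    std::string name;
    for (std::size_t i = 0; i < stem.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(stem[i]);
        if (i > 0 && std::isupper(c)) {
            name.push_back(' ');
        }
        name.push_back(stem[i]);
    }
    return name;
}

// Returns true if the file follows the "<Name>Solution.hlsl" convention.
bool IsSolutionFile(const std::string& filename) {
    const std::string suffix = "Solution.hlsl";
    return filename.size() > suffix.size() &&
           filename.compare(filename.size() - suffix.size(), suffix.size(), suffix) == 0;
}
```

Note that this simple rule would split acronyms letter by letter, so a sketch like this works best with filenames that use one capital letter per word.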
Each tutorial must define a node named `Entry` with no input record.
This node will be invoked once per frame.
The `Common.h` header file provides access to shader resources (output render target & scratch buffers) and utility methods for drawing text or primitives (lines & rectangles).
Shaders are compiled using the Microsoft DirectX shader compiler with the following arguments:
-T lib_6_8 -enable-16bit-types -HV 2021 -Zpc -I./tutorials/
See ShaderCompiler.cpp for more details.
Here, we show how you can directly build our Work Graph Playground from source.
- CMake 3.17
- Visual Studio 2019
- Windows 10 SDK 10.0.18362.0
- A Windows version that supports the Microsoft Agility SDK
Clone the repository, including the ImGui submodule:
git clone https://github.com/GPUOpen-LibrariesAndSDKs/WorkGraphPlayground.git --recurse-submodules
Configuring with CMake:
cmake -B build .
This command will download the following NuGet packages:
In order to use this software, you may need to have certain third party software installed on your system. Whether you install this software directly or whether a script is provided that, when executed by you, automatically fetches and installs software onto your system BY INSTALLING THE THIRD PARTY SOFTWARE, YOU AGREE TO BE BOUND BY THE LICENSE AGREEMENT(S) APPLICABLE TO SUCH SOFTWARE and you agree to carefully review and abide by the terms and conditions of all license(s) that govern such software. You acknowledge and agree that AMD is not distributing to you any of such software and that you are solely responsible for the installation of such software on your system.
Opening VS Solution:
cmake --open build
In Visual Studio, build and run the `Work Graph Playground` project.
See Adding new tutorials to add new tutorials. Re-run `cmake -B build .` to add any new files to the Visual Studio solution.
While Work Graphs is a new feature, there are already some resources available.
Work Graphs General:
- GPU Work Graphs in Microsoft DirectX® 12
- Work graphs API – compute rasterizer learning sample
- GDC 2024 - GPU Work Graphs: Welcome to the Future of GPU Programming
- HPG 2024 - Work Graphs: Hands-On with the Future of Graphics Programming
Mesh Nodes:
- GDC 2024 - Work Graphs and draw calls – a match made in heaven!
- GPU Work Graphs mesh nodes in Microsoft DirectX® 12
- HPG 2024 - Real-Time Procedural Generation with GPU Work Graphs
Work Graphs Samples: