Pipeline
As you may have figured out while reading the previous chapter, GPU shaders are not self-sufficient: almost every useful shader needs CPU-provided input and output resources. Before a shader can run, CPU-side steering code must therefore bind those resources to the shader’s data interface.
In an attempt to simplify the life of larger applications, GPU APIs like to make this resource binding process very flexible. But this flexibility comes at a cost: if the Vulkan spec just kept adding more configuration options at resource binding time, it would quickly cause two problems:
- Resource binding calls, which sit on the application’s performance-critical code path where GPU shaders can be executed many times in rapid succession, could become more complex and more expensive for the GPU driver to process.
- GPU compilers would know less and less about the resource binding setup in use, and would generate increasingly generic and unoptimized shader code around access to resources.1
To resolve these problems and a few others2, Vulkan has introduced pipelines. A compute pipeline extends an underlying compute shader with some early “layout” information about how we are going to bind resources later on. What is then compiled by the GPU driver and executed by the GPU is not a shader, but a complete pipeline. And as a result, we get GPU code that is more specialized for the input/output resources that we are eventually going to bind to it, and should therefore perform better without any need for run-time recompilation.
In this chapter, we will see what steps are involved in order to turn our previously written Gray-Scott simulation compute shader into a compute pipeline.
From shader to pipeline stage
First of all, vulkano-shaders generated SPIR-V code from our GLSL, but we must turn it into a device-specific ShaderModule before we can do anything with it. This is done using the load() function that vulkano-shaders also generated for us within the gpu::pipeline::shader module:
let shader_module = shader::load(context.device.clone())?;
We are then allowed to adjust the value of specialization constants within the module, which lets us provide simulation parameters at JIT compilation time and have the GPU code specialized for them. But we are not using this Vulkan feature yet, so we will skip this step for now.
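For reference, here is roughly what that step could look like. This is only a sketch: it assumes a hypothetical specialization constant declared in the GLSL as layout(constant_id = 0) const float KILL_RATE, and a vulkano version that provides the specialize() method:

use std::collections::HashMap;
use vulkano::shader::SpecializationConstant;
// Sketch: override the hypothetical GLSL specialization constant
// `layout(constant_id = 0) const float KILL_RATE = 0.062;` at JIT time
let specialized_module = shader_module
    .specialize(HashMap::from([(0, SpecializationConstant::F32(0.062))]))?;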
After specializing the module’s code, we must designate a function within the module that will act as an entry point for our compute pipeline. Every function that takes no parameters and returns no result is an entry point candidate, and for simple modules with a single entry point, vulkano provides us with a little shortcut:
let entry_point = shader_module
    .single_entry_point()
    .expect("No entry point found");
From this entry point, we can then define a pipeline stage. This is the point where we would be able to adjust the SIMD configuration on GPUs that support several of them, like Nvidia’s post-Volta GPUs and Intel’s integrated GPUs. But we are not doing SIMD configuration fine-tuning in our very first example of Vulkan computing, so we’ll just stick with the defaults:
use vulkano::pipeline::PipelineShaderStageCreateInfo;
let shader_stage = PipelineShaderStageCreateInfo::new(entry_point);
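For the curious, here is a sketch of what such SIMD fine-tuning could look like instead of the default configuration above. It assumes a device where the subgroup size control feature is enabled, which we do not rely on in this course:

// Alternative sketch: request a specific SIMD width for this stage
let _tuned_stage = PipelineShaderStageCreateInfo {
    required_subgroup_size: Some(32), // e.g. 32-wide subgroups
    ..PipelineShaderStageCreateInfo::new(entry_point)
};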
If we were doing 3D graphics, we would need to create more of these pipeline stages: one for vertex processing, one for fragment processing, several more if we’re using optional features like tessellation or hardware ray tracing… but here, we are building compute pipelines, which only have a single stage, so at this point we are done specifying what code our compute pipeline will run.
Pipeline layout
Letting vulkano help us
After defining our single compute pipeline stage, we must then tell Vulkan about the layout of our compute pipeline’s inputs and outputs. This is basically the information that we specified in our GLSL shader’s I/O interface, extended with some performance tuning knobs, so vulkano provides us with a way to infer a basic default configuration from the shader itself:
use vulkano::pipeline::layout::PipelineDescriptorSetLayoutCreateInfo;
let mut layout_info =
    PipelineDescriptorSetLayoutCreateInfo::from_stages([&shader_stage]);
There are three parts to the pipeline layout configuration:
- flags, which are not used for now, but reserved for use by future Vulkan versions.
- set_layouts, which lets us configure each of our shader’s descriptor sets.
- push_constant_ranges, which lets us configure its push constants.
We have not introduced push constants before. They are a way to quickly pass small amounts of information (hardware-dependent, at least 128 bytes) to a pipeline by directly storing it inline within the GPU command that starts the pipeline’s execution. We will not need them in this Gray-Scott reaction simulation because there are no simulation inputs that vary on each simulation step other than the input image, which itself is way too big to fit inside of a push constant.
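For illustration only, here is what declaring such a range would look like on the Rust side, assuming a hypothetical single-f32 parameter that we wanted to pass to the compute stage:

use vulkano::pipeline::layout::PushConstantRange;
use vulkano::shader::ShaderStages;
// Hypothetical: a 4-byte push constant visible to the compute stage
let _example_range = PushConstantRange {
    stages: ShaderStages::COMPUTE,
    offset: 0,
    size: 4,
};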
Therefore, the only thing that we actually need to pay attention to in this pipeline layout configuration is the set_layouts descriptor set configuration.
Configuring our input descriptors
This is a Vec that contains one entry per descriptor set that our shader refers to (i.e. each distinct set = xxx number found in the GLSL code). For each of these descriptor sets, we can adjust some global configuration that pertains to advanced Vulkan features beyond the scope of this course, and then we have one configuration per binding (i.e. each distinct binding = yyy number used together with this set = xxx number in the GLSL code).
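As an illustration of this set/binding correspondence, the IMAGES_SET, IN_U and IN_V constants that we will use below could be defined as follows. The exact values are assumptions for illustration, and must match the GLSL interface from the previous chapter:

// Hypothetical values matching layout(set = 0, binding = N) in the GLSL
const IMAGES_SET: u32 = 0; // set = 0 in the GLSL code
const IN_U: u32 = 0; // binding = 0: concentration of chemical species U
const IN_V: u32 = 1; // binding = 1: concentration of chemical species V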
Most of the per-binding configuration, in turn, was filled out by vulkano with good default values. But there is one of them that we will want to adjust.
In Vulkan, when using sampled images (which we do in order to have the GPU handle out-of-bounds values for us), it is often the case that we will not need to adjust the image sampling configuration during the lifetime of the application. In that case, it is more efficient to specialize the GPU code for the sampling configuration that we are going to use. Vulkan lets us do this by specifying that configuration at compute pipeline creation time.
To this end, we first create a sampler…
use vulkano::image::sampler::{
    BorderColor, Sampler, SamplerAddressMode, SamplerCreateInfo,
};
let input_sampler = Sampler::new(
    context.device.clone(),
    SamplerCreateInfo {
        address_mode: [SamplerAddressMode::ClampToBorder; 3],
        border_color: BorderColor::FloatOpaqueBlack,
        unnormalized_coordinates: true,
        ..Default::default()
    },
)?;
…and then we add it as an immutable sampler to the binding descriptors associated with our simulation shader’s inputs:
let image_bindings = &mut layout_info.set_layouts[IMAGES_SET as usize].bindings;
image_bindings
    .get_mut(&IN_U)
    .expect("Did not find expected shader input IN_U")
    .immutable_samplers = vec![input_sampler.clone()];
image_bindings
    .get_mut(&IN_V)
    .expect("Did not find expected shader input IN_V")
    .immutable_samplers = vec![input_sampler];
Let us quickly go through the sampler configuration that we are using here:
- The ClampToBorder address mode ensures that any out-of-bounds read from this sampler will return a color specified by border_color.
- The FloatOpaqueBlack border color specifies the floating-point RGBA color [0.0, 0.0, 0.0, 1.0]. We will only be using the first color component, so the FloatTransparentBlack alternative would be equally appropriate here.
- The unnormalized_coordinates parameter lets us index the sampled texture by pixel index, rather than by a normalized coordinate from 0.0 to 1.0 that we would otherwise need to derive from the pixel index.
The rest of the default sampler configuration works for us.
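To make the benefit of unnormalized coordinates concrete, here is a sketch of the conversion that we would otherwise need to perform for every read. It is written in Rust for consistency with the rest of this chapter, though the real computation would live in the GLSL shader:

// Without unnormalized_coordinates, a pixel index must be turned into a
// [0.0, 1.0] coordinate by targeting the pixel center and dividing by the
// image extent along the matching axis
fn normalized_coord(pixel_idx: u32, image_extent: u32) -> f32 {
    (pixel_idx as f32 + 0.5) / image_extent as f32
}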
Building the compute pipeline
With that, we are done configuring our descriptors, so we can finalize our descriptor set layouts…
let layout_info =
    layout_info.into_pipeline_layout_create_info(context.device.clone())?;
…create our compute pipeline layout…
use vulkano::pipeline::layout::PipelineLayout;
let layout = PipelineLayout::new(context.device.clone(), layout_info)?;
…and combine it with the shader stage from the previous section and the compilation cache that we created earlier in order to build our simulation pipeline.
use vulkano::pipeline::compute::{ComputePipeline, ComputePipelineCreateInfo};
let pipeline = ComputePipeline::new(
    context.device.clone(),
    Some(context.pipeline_cache.cache.clone()),
    ComputePipelineCreateInfo::stage_layout(shader_stage, layout),
)?;
Exercise
Integrate all of the above into the Rust project’s gpu::pipeline module, as a new create_pipeline() function, then make sure that the code still compiles. A test run is not useful here, as runtime behavior remains unchanged for now.
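If you get stuck, here is one possible shape for this function, assembled from the snippets above. This is a sketch only: the Context type and its fields, the generated shader module, and the IMAGES_SET/IN_U/IN_V binding constants are all assumed to match the code from earlier chapters, and the error type is one arbitrary choice among several reasonable ones.

use std::{error::Error, sync::Arc};
use vulkano::{
    image::sampler::{BorderColor, Sampler, SamplerAddressMode, SamplerCreateInfo},
    pipeline::{
        compute::{ComputePipeline, ComputePipelineCreateInfo},
        layout::{PipelineDescriptorSetLayoutCreateInfo, PipelineLayout},
        PipelineShaderStageCreateInfo,
    },
};

pub fn create_pipeline(context: &Context) -> Result<Arc<ComputePipeline>, Box<dyn Error>> {
    // Turn the generated SPIR-V into a device-specific shader module...
    let shader_module = shader::load(context.device.clone())?;

    // ...then pick its single entry point and wrap it in a pipeline stage
    let entry_point = shader_module
        .single_entry_point()
        .expect("No entry point found");
    let shader_stage = PipelineShaderStageCreateInfo::new(entry_point);

    // Infer a default pipeline layout configuration from the shader...
    let mut layout_info =
        PipelineDescriptorSetLayoutCreateInfo::from_stages([&shader_stage]);

    // ...and attach an immutable sampler to the input image bindings
    let input_sampler = Sampler::new(
        context.device.clone(),
        SamplerCreateInfo {
            address_mode: [SamplerAddressMode::ClampToBorder; 3],
            border_color: BorderColor::FloatOpaqueBlack,
            unnormalized_coordinates: true,
            ..Default::default()
        },
    )?;
    let image_bindings = &mut layout_info.set_layouts[IMAGES_SET as usize].bindings;
    image_bindings
        .get_mut(&IN_U)
        .expect("Did not find expected shader input IN_U")
        .immutable_samplers = vec![input_sampler.clone()];
    image_bindings
        .get_mut(&IN_V)
        .expect("Did not find expected shader input IN_V")
        .immutable_samplers = vec![input_sampler];

    // Finalize the layout, then build the full compute pipeline
    let layout_info =
        layout_info.into_pipeline_layout_create_info(context.device.clone())?;
    let layout = PipelineLayout::new(context.device.clone(), layout_info)?;
    Ok(ComputePipeline::new(
        context.device.clone(),
        Some(context.pipeline_cache.cache.clone()),
        ComputePipelineCreateInfo::stage_layout(shader_stage, layout),
    )?)
}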
1. There is an alternative to this, widely used by OpenGL drivers, which is to recompile more specialized shader code at the time when resources are bound. But since resources are bound right before a shader is run, this recompilation work can result in a marked execution delay on shader runs for which the specialized code has not been compiled yet. In real-time graphics, delaying frame rendering work like this is highly undesirable, as it can result in dropped frames and janky on-screen movement.
2. …like the problem of combining separately defined vertex, geometry, tessellation and fragment shaders in traditional triangle rasterization pipelines.