Pipeline

As you may have figured out while reading the previous chapter, GPU shaders are not self-sufficient. Almost every useful shader needs to interact with CPU-provided input and output resources, so before a shader can run, CPU-side steering code must first bind those resources to the shader’s data interface.

To make life easier for larger GPU applications, GPU APIs like to make this resource binding process very flexible. But this flexibility comes with a tradeoff: if the Vulkan spec just kept adding more configuration options at resource binding time, it would quickly cause two problems:

  1. Resource binding calls, which sit on the application’s performance-critical code path where GPU shaders can be executed many times in rapid succession, could become more complex and more expensive for the GPU driver to process.
  2. GPU compilers would know less and less about the resource binding setup in use, and would generate increasingly generic and unoptimized shader code around access to resources.1

To resolve these problems and a few others2, Vulkan has introduced pipelines. A compute pipeline extends an underlying compute shader with some early “layout” information about how we are going to bind resources later on. What the GPU driver then compiles and the GPU executes is not a shader, but a complete pipeline. As a result, we get GPU code that is more specialized for the input/output resources that we will eventually bind to it, and that should therefore perform better without any need for run-time recompilation.

In this chapter, we will see what steps are involved in turning our previously written Gray-Scott simulation compute shader into a compute pipeline.

From shader to pipeline stage

First of all, vulkano-shaders generated SPIR-V code from our GLSL, but before we can do anything with that code, we must turn it into a device-specific ShaderModule. This is done using the load() function that vulkano-shaders also generated for us within the shader module:

let shader_module = shader::load(context.device.clone())?;

We are then allowed to adjust the value of specialization constants within the module, which lets us provide simulation parameters at JIT compilation time and have the GPU code specialized for them. But we are not using this Vulkan feature yet, so we will skip this step for now.
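
For reference, the vulkano-side code would look something like the following sketch. It assumes a hypothetical GLSL-side declaration like layout(constant_id = 0) const float FEED_RATE = 0.014; — the constant ID, name and value are all made up for illustration:

use vulkano::shader::SpecializationConstant;

// Hypothetical sketch: override the specialization constant with ID 0.
// The entry point would then be looked up on the resulting
// SpecializedShaderModule rather than on the original shader_module.
let specialized_module = shader_module.specialize(
    [(0, SpecializationConstant::F32(0.014))].into_iter().collect(),
)?;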

After specializing the module’s code, we must designate a function within the module that will act as an entry point for our compute pipeline. Every function that takes no parameters and returns no result is an entry point candidate, and for simple modules with a single entry point, vulkano provides us with a little shortcut:

let entry_point = shader_module
    .single_entry_point()
    .expect("No entry point found");

From this entry point, we can then define a pipeline stage. This is the point where we would be able to adjust the SIMD configuration on GPUs that support more than one, like Nvidia’s post-Volta GPUs and Intel’s integrated GPUs. But we are not doing this kind of fine-tuning in our very first example of Vulkan computing, so we’ll just stick with the defaults:

use vulkano::pipeline::PipelineShaderStageCreateInfo;

let shader_stage = PipelineShaderStageCreateInfo::new(entry_point);
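
For reference, if we later wanted to pick the SIMD width (subgroup size) ourselves, on a device where Vulkan’s subgroup_size_control feature is enabled, we could build the stage as in the following sketch instead. The subgroup size of 32 is an arbitrary example value:

// Sketch only: request a specific SIMD width (subgroup size) instead of
// letting the driver pick one; requires the subgroup_size_control feature
let shader_stage = PipelineShaderStageCreateInfo {
    required_subgroup_size: Some(32),
    ..PipelineShaderStageCreateInfo::new(entry_point)
};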

If we were doing 3D graphics, we would need to create more of these pipeline stages: one for vertex processing, one for fragment processing, several more if we were using optional features like tessellation or hardware ray tracing… But here, we are building a compute pipeline, which only has a single stage, so at this point we are done specifying what code our compute pipeline will run.

Pipeline layout

Letting vulkano help us

After defining our single compute pipeline stage, we must then tell Vulkan about the layout of our compute pipeline’s inputs and outputs. This is basically the information that we specified in our GLSL shader’s I/O interface, extended with some performance tuning knobs, so vulkano provides us with a way to infer a basic default configuration from the shader itself:

use vulkano::pipeline::layout::PipelineDescriptorSetLayoutCreateInfo;

let mut layout_info =
    PipelineDescriptorSetLayoutCreateInfo::from_stages(&[shader_stage]);

There are three parts to the pipeline layout configuration:

  • flags, which are not used for now, but reserved for use by future Vulkan versions.
  • set_layouts, which lets us configure each of our shader’s descriptor sets.
  • push_constant_ranges, which let us configure its push constants.

We have not introduced push constants before. They are a way to quickly pass small amounts of information (of hardware-dependent maximum size, but at least 128 bytes) to a pipeline, by storing it directly inline within the GPU command that starts the pipeline’s execution. We will not need them in this Gray-Scott reaction simulation, because a simulation step needs no input other than the input images, which are way too big to fit inside of a push constant.
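
For illustration, if our shader did declare a small push constant block, the matching entry of push_constant_ranges would look something like the sketch below. The 8-byte size is an arbitrary example (say, two f32 simulation parameters), and from_stages() would normally infer this range from the shader for us:

use vulkano::{pipeline::layout::PushConstantRange, shader::ShaderStages};

// Sketch only: an 8-byte push constant range visible to the compute stage
let example_range = PushConstantRange {
    stages: ShaderStages::COMPUTE,
    offset: 0,
    size: 8,
};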

Therefore, the only thing that we actually need to pay attention to in this pipeline layout configuration is the set_layouts descriptor set configuration.

Configuring our input descriptors

This is a Vec that contains one entry per descriptor set that our shader refers to (i.e. each distinct set = xxx number found in the GLSL code). For each of these descriptor sets, we can adjust some global configuration that pertains to advanced Vulkan features beyond the scope of this course, and then we have one configuration per binding (i.e. each distinct binding = yyy number used together with this set = xxx number in GLSL code).
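
If you would like to double-check what vulkano inferred from the shader, a quick debugging sketch like the following can dump the per-set, per-binding configuration (this is throwaway code, not something to keep in the project):

// Debugging sketch: print the inferred descriptor layout, so that it can
// be compared against the set/binding declarations in the GLSL source
for (set, set_layout) in layout_info.set_layouts.iter().enumerate() {
    for (binding, binding_info) in &set_layout.bindings {
        println!(
            "set = {set}, binding = {binding}: {:?}",
            binding_info.descriptor_type,
        );
    }
}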

Most of the per-binding configuration, in turn, was filled out by vulkano with good default values. But there is one setting that we will want to adjust.

In Vulkan, when using sampled images (which we do in order to have the GPU handle out-of-bounds values for us), it is often the case that we will not need to adjust the image sampling configuration during the lifetime of the application. In that case, it is more efficient to specialize the GPU code for the sampling configuration that we are going to use. Vulkan lets us do this by specifying that configuration at compute pipeline creation time.

To this end, we first create a sampler…

use vulkano::image::sampler::{
    BorderColor, Sampler, SamplerAddressMode, SamplerCreateInfo,
};

let input_sampler = Sampler::new(
    context.device.clone(),
    SamplerCreateInfo {
        address_mode: [SamplerAddressMode::ClampToBorder; 3],
        border_color: BorderColor::FloatOpaqueBlack,
        unnormalized_coordinates: true,
        ..Default::default()
    },
)?;

…and then we add it as an immutable sampler to the binding descriptors associated with our simulation shader’s inputs:

let image_bindings = &mut layout_info.set_layouts[IMAGES_SET as usize].bindings;
image_bindings
    .get_mut(&IN_U)
    .expect("Did not find expected shader input IN_U")
    .immutable_samplers = vec![input_sampler.clone()];
image_bindings
    .get_mut(&IN_V)
    .expect("Did not find expected shader input IN_V")
    .immutable_samplers = vec![input_sampler];

Let us quickly go through the sampler configuration that we are using here:

  • The ClampToBorder address mode ensures that any out-of-bounds read from this sampler will return a color specified by border_color.
  • The FloatOpaqueBlack border color specifies the floating-point RGBA color [0.0, 0.0, 0.0, 1.0]. We will only be using the first color component, so the FloatTransparentBlack alternative would be equally appropriate here.
  • The unnormalized_coordinates parameter lets us index the sampled texture by pixel index, rather than by normalized coordinates going from 0.0 to 1.0 across the texture, which we would otherwise need to derive from the pixel index.

The rest of the default sampler configuration works for us.

Building the compute pipeline

With that, we are done configuring our descriptors, so we can finalize our descriptor set layouts…

let layout_info =
    layout_info.into_pipeline_layout_create_info(context.device.clone())?;

…create our compute pipeline layout…

use vulkano::pipeline::layout::PipelineLayout;

let layout = PipelineLayout::new(context.device.clone(), layout_info)?;

…and combine that with the shader stage that we created in the previous section and the pipeline compilation cache that we created earlier, in order to build our simulation pipeline.

use vulkano::pipeline::compute::{ComputePipeline, ComputePipelineCreateInfo};

let pipeline = ComputePipeline::new(
    context.device.clone(),
    Some(context.pipeline_cache.cache.clone()),
    ComputePipelineCreateInfo::stage_layout(shader_stage, layout),
)?;

Exercise

Integrate all of the above into the Rust project’s gpu::pipeline module, as a new create_pipeline() function, then make sure that the code still compiles. A test run is not useful here as runtime behavior remains unchanged for now.


1

There is an alternative to this, widely used by OpenGL drivers, which is to recompile more specialized shader code at the time when resources are bound. But since resources are bound right before a shader runs, this recompilation work can result in a marked execution delay on shader runs for which the specialized code has not been compiled yet. In real-time graphics, delaying frame rendering work like this is highly undesirable, as it can result in dropped frames and janky on-screen movement.

2

…like the problem of combining separately defined vertex, geometry, tessellation and fragment shaders in traditional triangle rasterization pipelines.