OGRE-Next
3.0.0
Object-Oriented Graphics Rendering Engine
Ogre 2.3 introduced the concept of root layouts, and they need to be set up for Vulkan.
If you're familiar with D3D12, you may have noticed that we borrowed this concept.
Older APIs like D3D11 and GL (and Metal 1) use a 'table binding' model. It is very simple to understand:
D3D11 offers 128 texture slots, so a texture can be bound to any one of them, and the shader declares the matching slot (e.g. register t0) for each texture it uses.
IHVs recommend not leaving gaps, because drivers flush based on the min-max range of touched slots. E.g. if we have 10 textures and bind the first 9 at slots [0; 8] and the last one at slot 127, the driver will flush all 128 slots instead of just 10.
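To make the cost of gaps concrete, here is a minimal sketch of the min-max rule. It is a simplified model and not driver code; the names flushedSlotCount and exampleSlots are made up for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical model: how many slots would be flushed if only the lowest and
// highest touched slot of a binding table are tracked.
std::size_t flushedSlotCount( const std::vector<std::size_t> &touchedSlots )
{
    if( touchedSlots.empty() )
        return 0u;
    const auto minMax = std::minmax_element( touchedSlots.begin(), touchedSlots.end() );
    return *minMax.second - *minMax.first + 1u;
}

// Builds the example from the text: slots 0..8 plus one more slot.
std::vector<std::size_t> exampleSlots( std::size_t lastSlot )
{
    std::vector<std::size_t> slots;
    for( std::size_t i = 0u; i < 9u; ++i )
        slots.push_back( i );
    slots.push_back( lastSlot );
    return slots;
}
```

Under this model, placing the tenth texture at slot 9 covers 10 slots, while placing it at slot 127 covers the whole range from 0 to 127.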
The same table model is used for other resources: const buffers, UAV buffers and UAV textures.
This table model is simple to understand, although not all APIs agree on what goes together. E.g. in Metal, regular textures and UAV textures share the same table. In D3D11 they have separate tables, thus regularTexture can be bound to slot t1 and uavTexture to slot u1; while on Metal only one of them can be bound to slot [[ texture( 1 ) ]].
Ogre tries to abstract these differences.
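The difference between separate and shared tables can be sketched with a tiny binding model. This is only an illustration of the behaviour described above, not Ogre or API code; Api, TextureKind, BindingState and bindBothAtSlot1 are hypothetical names.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>

enum class Api { D3D11, Metal };
enum class TextureKind { Regular, Uav };

// Table a texture kind is bound through. D3D11 keeps regular textures ('t')
// and UAV textures ('u') apart; Metal uses one shared texture table.
// The letter 'x' for Metal is an arbitrary label chosen for this sketch.
char tableFor( Api api, TextureKind kind )
{
    if( api == Api::Metal )
        return 'x';
    return kind == TextureKind::Regular ? 't' : 'u';
}

// Minimal binding state: one resource name per (table, slot) pair.
class BindingState
{
    std::map<std::pair<char, unsigned>, std::string> mBound;

public:
    // Binding to an occupied (table, slot) pair replaces the previous resource.
    void bind( Api api, TextureKind kind, unsigned slot, const std::string &name )
    {
        mBound[std::make_pair( tableFor( api, kind ), slot )] = name;
    }

    std::size_t numBound() const { return mBound.size(); }
};

// Binds regularTexture and uavTexture to slot 1 and reports how many remain bound.
std::size_t bindBothAtSlot1( Api api )
{
    BindingState state;
    state.bind( api, TextureKind::Regular, 1u, "regularTexture" );
    state.bind( api, TextureKind::Uav, 1u, "uavTexture" );
    return state.numBound();
}
```

With the D3D11 mapping both resources stay bound, whereas with the Metal mapping the second bind replaces the first.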
In newer APIs, these tables don't exist. Developers can lay out resources in arbitrary ways and then describe to the API what is at each offset. A binding can contain ANYTHING: samplers, const buffers, textures, etc.
e.g. in pseudo code:
Two shaders that have the same root layout (even if they don't need to use all the resources declared) are said to be compatible.
In the example A & B are not compatible because slot 2 uses a different type. This means their size in bytes could be different, so even though textureD is in slot 3 in both shaders, its offset in bytes could be completely different.
Thus when switching from A to B we'd need to rebind everything again (or at best rebind slots 2 and 3, not just slot 2).
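The compatibility rule can be sketched as follows. The two layouts are assumptions chosen to match the description (a different type at slot 2, a texture at slot 3 in both); they are not the original pseudo code, and ResourceType, Layout, areCompatible and firstSlotToRebind are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical resource kinds a binding may hold.
enum class ResourceType { ConstBuffer, Texture, Sampler };

// A layout is modelled as an ordered list of resource types, one per slot.
using Layout = std::vector<ResourceType>;

// Two layouts are treated as compatible only when every slot holds the same type.
bool areCompatible( const Layout &a, const Layout &b ) { return a == b; }

// Returns the first slot whose type differs (or the shorter layout's size if
// one layout is a prefix of the other). Everything from that slot onwards may
// sit at a different byte offset, so it would need to be rebound.
std::size_t firstSlotToRebind( const Layout &a, const Layout &b )
{
    std::size_t i = 0u;
    while( i < a.size() && i < b.size() && a[i] == b[i] )
        ++i;
    return i;
}

// Assumed layout A: slot 2 is a sampler, slot 3 is textureD.
Layout makeLayoutA()
{
    return { ResourceType::ConstBuffer, ResourceType::Texture, ResourceType::Sampler,
             ResourceType::Texture };
}

// Assumed layout B: same as A except that slot 2 is a const buffer.
Layout makeLayoutB()
{
    return { ResourceType::ConstBuffer, ResourceType::Texture, ResourceType::ConstBuffer,
             ResourceType::Texture };
}
```

For these two layouts the first differing slot is 2, so slots 2 and 3 both have to be rebound even though slot 3 holds a texture in both.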
Sharing root layouts maximizes CPU & GPU performance by lowering the amount of switching. Conversely, very big root layouts can hurt GPU performance due to their size and the number of registers consumed.
Ultimately we want to share as much as possible, but not make the Root Layout gigantic just to achieve maximum sharing.
The fact that resource declaration can be so arbitrary gives a lot of power and flexibility, but it can be difficult to set up and easy to mess up, and deciding how to approach the problem can be overwhelming.
There are three ways to approach it:
Ogre follows the 3rd approach (except when arrays of textures are used, which use the 2nd approach).
Root Layouts can have up to 4 sets (which is the minimum guaranteed by Vulkan).
It is common practice in Ogre to leave set 0 for resources bound in the traditional way (i.e. like a table model in D3D11 / OpenGL) while set 1 is set to 'baked' where DescriptorSetTexture/Sampler/Texture2/Uav are bound to it.
Another way to look at Ogre's RootLayout is that it tells Ogre which resources the shader will use. This lets us properly emulate tables and compile the shader using slot locations calculated by us, while the shader author keeps using binding slot indices that match the ones in the shader code for other APIs (i.e. D3D11, GL).
An HLSL shader that ONLY declares and uses the following resources:
Would work using the following RootLayout:
That is, explicitly declare that you're using const buffer range [4; 7), tex buffer range [1; 2) etc.
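As a rough illustration of how declared ranges could be turned into slot locations, the sketch below packs half-open ranges back to back. The packing order and the names SlotRange, bindingIndex and exampleRanges are assumptions made for this example; Ogre's actual calculation may differ.

```cpp
#include <cstddef>
#include <vector>

// Half-open slot range [start; end), as in "const buffer range [4; 7)".
struct SlotRange
{
    std::size_t start;
    std::size_t end;

    std::size_t count() const { return end - start; }
    bool contains( std::size_t slot ) const { return slot >= start && slot < end; }
};

// Hypothetical packing: ranges are laid out back to back in the order given,
// so a slot's binding index is the total size of the preceding ranges plus
// the slot's offset inside its own range.
// Returns -1 when the slot was not declared in the chosen range.
long bindingIndex( const std::vector<SlotRange> &ranges, std::size_t rangeIdx,
                   std::size_t slot )
{
    if( rangeIdx >= ranges.size() || !ranges[rangeIdx].contains( slot ) )
        return -1;
    std::size_t base = 0u;
    for( std::size_t i = 0u; i < rangeIdx; ++i )
        base += ranges[i].count();
    return static_cast<long>( base + ( slot - ranges[rangeIdx].start ) );
}

// Ranges from the text: const buffers [4; 7) followed by tex buffers [1; 2).
std::vector<SlotRange> exampleRanges() { return { { 4u, 7u }, { 1u, 2u } }; }
```

In this model const buffer slot 4 lands at binding 0 and tex buffer slot 1 at binding 3, while an undeclared slot such as const buffer 3 has no binding at all.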
In Vulkan we will automatically generate macros for use in bindings (buffers are uppercase letter, textures lowercase):
Thus a GLSL shader can use it like this:
Can a RootLayout declare more slots than the shader needs, i.e. declare slots in range [0; 4) even though they won't be used?
Yes. But you would be consuming more memory.
Note that if your vertex shader uses slots [0; 3) and your pixel shader uses range [4; 7), then BOTH shaders must use a RootLayout that at least declares range [0; 7) so they can be paired together.
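The pairing rule boils down to taking the smallest range that covers both stages. A minimal sketch, with StageRange and coveringRange as hypothetical names:

```cpp
#include <algorithm>
#include <cstddef>

// Half-open range [start; end) of slots used by one shader stage.
struct StageRange
{
    std::size_t start;
    std::size_t end;
};

// Smallest single range that covers both inputs. Slots that fall between the
// two ranges are declared as well, even if neither stage uses them.
StageRange coveringRange( const StageRange &a, const StageRange &b )
{
    return { std::min( a.start, b.start ), std::max( a.end, b.end ) };
}
```

For a vertex shader using [0; 3) and a pixel shader using [4; 7) this yields [0; 7), which also declares the unused slot 3.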
RootLayouts have a memory vs performance trade off:
That's why low level materials provide prefab RootLayouts, in order to maximize RootLayout reuse while also keeping reasonable memory consumption.
See GpuProgram::setPrefabRootLayout
Shaders can declare their Root Layouts as JSON inside comments, as long as the block starts with ## ROOT LAYOUT BEGIN and ends with ## ROOT LAYOUT END:
Or it can be declared in C++. This is the preferred method for Hlms shader code to maximize performance.
HlmsUnlit::setupRootLayout has a simple example:
Non-baked sets are meant to behave very similarly to table models in D3D11 and OpenGL.
Baked sets on the other hand are meant exclusively for binding DescriptorSetTexture, DescriptorSetSampler, DescriptorSetTexture2 and DescriptorSetUav.
The size of the DescriptorSet* must exactly match the number of binding slots in the RootLayout.
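The size requirement can be expressed as a simple check. This sketch assumes the baked set declares a half-open range of slots; BakedRange and descriptorSetFits are hypothetical names and not part of Ogre's API.

```cpp
#include <cstddef>

// Half-open range [start; end) of binding slots declared by a baked set.
struct BakedRange
{
    std::size_t start;
    std::size_t end;
};

// True only when the descriptor set holds exactly as many entries as the
// range declares; both too few and too many entries are rejected.
bool descriptorSetFits( std::size_t numEntries, const BakedRange &range )
{
    return range.end >= range.start && numEntries == range.end - range.start;
}
```

A range of [2; 5) therefore expects exactly 3 entries.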
To ease porting of low level materials (i.e. *.material and *.program scripts), most low level materials don't need to declare Root Layouts because we have prefabs for them:
Prefab | Description |
---|---|
None | Defined in shader source or externally via C++ |
Standard | 4 textures per material, VS and PS only (default) |
High | 8 textures per material, VS and PS only |
Max | 32 textures per material, all shader stages |
The majority of low level materials are fine with Standard, and this allows us to maximize Root Layout reuse.
If you need something different, you can either change the prefab with root_layout standard|high|max|none in the shader program script definition, or declare the root layout inside the shader code.
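Choosing a prefab from the table can be sketched as picking the smallest one that fits. The selection rule below is an assumption based only on the texture counts and stage limits listed in the table (it ignores buffers and other resources), and Prefab and pickPrefab are hypothetical names.

```cpp
#include <cstddef>

enum class Prefab { None, Standard, High, Max };

// Picks the smallest prefab from the table that fits the material.
// Prefab::None means no prefab fits and a custom root layout is needed.
Prefab pickPrefab( std::size_t numTextures, bool usesOnlyVsAndPs )
{
    if( usesOnlyVsAndPs && numTextures <= 4u )
        return Prefab::Standard;
    if( usesOnlyVsAndPs && numTextures <= 8u )
        return Prefab::High;
    if( numTextures <= 32u )
        return Prefab::Max;
    return Prefab::None;
}
```

Preferring the smallest prefab that fits follows the goal stated above: reuse Root Layouts as much as possible without paying for slots that are never used.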
When a shader contains arrays of textures or samplers with an array length > 1, e.g.
We need one of the following to specify that these slots are arrays, and what their length is:
In automatic mode, arrays are not declared. Instead the shader is compiled, reflected, and then compiled again with a patched Root Layout (unless there were no arrays).
On Debug builds, Ogre will always reflect shaders to check that the arrays declared in root layouts match the arrays used by the shader.
Automatic is the default behavior for low level materials. For Hlms shaders it's turned off.
Automatic can be turned on or off via GpuProgram::setAutoReflectArrayBindingsInRootLayout or its script counterpart uses_array_bindings.
Compute Shaders can turn on automatic mode by setting the uses_array_bindings property via Hlms, e.g.
Vulkan and OpenGL both use GLSL. However, there are a few differences, mostly because of the different binding models.
As a result we provide a few abstractions to separate these differences:
Expression | Vulkan | OpenGL |
---|---|---|
vulkan() macro | Anything inside is kept | Anything inside is removed |
vulkan_layout() macro | It is converted to layout() | It is removed |
#version ogre_glsl_ver_xxx | Always converted to #version 450 | The ogre_glsl_ver_ part is removed and will be translated to #version xxx |
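The effect of the vulkan_layout() row can be imitated with a plain string transform. This is not how Ogre implements it (the table describes it as a macro); the sketch only shows the resulting source for each API, and applyVulkanLayout is a hypothetical helper.

```cpp
#include <cstddef>
#include <string>

// Rewrites every "vulkan_layout( ... )" found in src.
// forVulkan == true : the call becomes "layout( ... )".
// forVulkan == false: the call and its arguments are removed.
// Simplification: assumes "vulkan_layout" is immediately followed by '(' and
// that the parentheses inside the call are balanced.
std::string applyVulkanLayout( std::string src, bool forVulkan )
{
    const std::string macro = "vulkan_layout(";
    const std::string replacement = "layout(";
    std::size_t pos = 0u;
    while( ( pos = src.find( macro, pos ) ) != std::string::npos )
    {
        // Find the position just past the parenthesis that closes the call.
        std::size_t depth = 1u;
        std::size_t end = pos + macro.size();
        while( end < src.size() && depth > 0u )
        {
            if( src[end] == '(' )
                ++depth;
            else if( src[end] == ')' )
                --depth;
            ++end;
        }

        if( forVulkan )
        {
            src.replace( pos, macro.size(), replacement );
            pos += replacement.size();
        }
        else
        {
            src.erase( pos, end - pos );
        }
    }
    return src;
}
```

For example, the line vulkan_layout( location = 0 ) in vec4 v; keeps its qualifier as layout( location = 0 ) when targeting Vulkan, and loses it entirely when targeting OpenGL.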