Memory barrier not working?

Hi! I’m developing my own game engine and I have a problem. I keep one per-pixel linked list per cubemap face.

I’m using a 3D image to store the head pointers, and I use multiview to draw to each cubemap face.

I create my 3D image like this:

VkImageCreateInfo imageInfo{};
                imageInfo.sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO;
                imageInfo.imageType = VK_IMAGE_TYPE_3D;
                imageInfo.extent.width = static_cast<uint32_t>(window.getView().getSize().x);
                imageInfo.extent.height = static_cast<uint32_t>(window.getView().getSize().y);
                imageInfo.extent.depth = 6;
                imageInfo.mipLevels = 1;
                imageInfo.arrayLayers = 1;
                imageInfo.format = VK_FORMAT_R32_UINT;
                imageInfo.tiling = VK_IMAGE_TILING_OPTIMAL;
                imageInfo.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;
                imageInfo.usage = VK_IMAGE_USAGE_TRANSFER_DST_BIT | VK_IMAGE_USAGE_SAMPLED_BIT | VK_IMAGE_USAGE_STORAGE_BIT;
                imageInfo.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
                imageInfo.samples = VK_SAMPLE_COUNT_1_BIT;
                imageInfo.flags = 0; // Optional
                if (vkCreateImage(window.getDevice().getDevice(), &imageInfo, nullptr, &headPtrTextureImage) != VK_SUCCESS) {
                    throw std::runtime_error("failed to create image!");
                }

                VkMemoryRequirements memRequirements;
                vkGetImageMemoryRequirements(window.getDevice().getDevice(), headPtrTextureImage, &memRequirements);

                VkMemoryAllocateInfo allocInfo{};
                allocInfo.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
                allocInfo.allocationSize = memRequirements.size;
                allocInfo.memoryTypeIndex = findMemoryType(memRequirements.memoryTypeBits, VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT);


                if (vkAllocateMemory(window.getDevice().getDevice(), &allocInfo, nullptr, &headPtrTextureImageMemory) != VK_SUCCESS) {
                    throw std::runtime_error("failed to allocate image memory!");
                }
                vkBindImageMemory(window.getDevice().getDevice(), headPtrTextureImage, headPtrTextureImageMemory, 0);
                transitionImageLayout(headPtrTextureImage, VK_FORMAT_R32_UINT, VK_IMAGE_LAYOUT_UNDEFINED, VK_IMAGE_LAYOUT_GENERAL);

Then I create the image view:

VkImageViewCreateInfo viewInfo{};
                viewInfo.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO;
                viewInfo.image = headPtrTextureImage;
                viewInfo.viewType = VK_IMAGE_VIEW_TYPE_3D;
                viewInfo.format = VK_FORMAT_R32_UINT;
                viewInfo.subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
                viewInfo.subresourceRange.baseMipLevel = 0;
                viewInfo.subresourceRange.levelCount = 1;
                viewInfo.subresourceRange.baseArrayLayer = 0;
                viewInfo.subresourceRange.layerCount = 1;
                if (vkCreateImageView(vkDevice.getDevice(), &viewInfo, nullptr, &headPtrTextureImageView) != VK_SUCCESS) {
                    throw std::runtime_error("failed to create head ptr texture image view!");
                }

I specified the r32ui format so that atomic operations can be used on the image.

But there is a problem in this fragment shader:

const std::string fragmentShader = R"(#version 460
                                                      #extension GL_EXT_nonuniform_qualifier : enable
                                                      #extension GL_EXT_debug_printf : enable
                                                      struct NodeType {
                                                          vec4 color;
                                                          float depth;
                                                          uint next;
                                                      };
                                                      layout(set = 0, binding = 0) buffer CounterSSBO {
                                                          uint count[6];
                                                          uint maxNodes;
                                                      };
                                                      layout(set = 0, binding = 1, r32ui) uniform coherent uimage3D headPointers;
                                                      layout(set = 0, binding = 2) buffer linkedLists {
                                                          NodeType nodes[];
                                                      };
                                                      layout(set = 0, binding = 3) uniform sampler2D textures[];
                                                      layout (location = 0) in vec4 frontColor;
                                                      layout (location = 1) in vec2 fTexCoords;
                                                      layout (location = 2) in flat uint texIndex;
                                                      layout (location = 3) in vec3 normal;
                                                      layout (location = 4) in flat int viewIndex;
                                                      layout(location = 0) out vec4 fcolor;
                                                      void main() {
                                                           uint nodeIdx = atomicAdd(count[viewIndex], 1);
                                                           vec4 texel = (texIndex != 0) ? frontColor * texture(textures[texIndex-1], fTexCoords.xy) : frontColor;
                                                           if (nodeIdx < maxNodes) {
                                                                uint prevHead = imageAtomicExchange(headPointers, ivec3(gl_FragCoord.xy, viewIndex), nodeIdx);
                                                                nodes[nodeIdx+viewIndex*maxNodes].color = texel;
                                                                nodes[nodeIdx+viewIndex*maxNodes].depth = gl_FragCoord.z;
                                                                nodes[nodeIdx+viewIndex*maxNodes].next = prevHead;
                                                                debugPrintfEXT("prev head : %i, node Idx : %i, view index : %i\n", prevHead, nodeIdx, viewIndex);

                                                           }
                                                           fcolor = vec4(0, 0, 0, 0);
                                                      })";

imageAtomicExchange always seems to write 0 to my image, so the returned value is always -1 or 0, which is incorrect.
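
For readers following along, the insertion step in the shader above can be modelled on the CPU. This is only a rough sketch, not code from the engine: `NodePool`, `PixelList`, `insertFragment` and `collect` are hypothetical names, `std::atomic` stands in for `atomicAdd` and `imageAtomicExchange`, and it assumes the head pointers start at 0xffffffff as an end-of-list marker.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <vector>

// Assumed end-of-list marker: the value every head pointer holds before any insert.
constexpr std::uint32_t kEndOfList = 0xffffffffu;

struct Node {
    float depth;
    std::uint32_t next;
};

// Hypothetical stand-in for one texel of the head-pointer image.
struct PixelList {
    std::atomic<std::uint32_t> head{kEndOfList};
};

// Hypothetical stand-in for the counter SSBO plus the node SSBO of one face.
struct NodePool {
    explicit NodePool(std::uint32_t maxNodes) : nodes(maxNodes) {}
    std::atomic<std::uint32_t> count{0};
    std::vector<Node> nodes;
};

// Mirrors the shader: reserve a node, swap it in as the new head,
// then link it to the previous head. Returns false when the pool is full.
bool insertFragment(NodePool& pool, PixelList& pixel, float depth) {
    const std::uint32_t nodeIdx = pool.count.fetch_add(1);        // atomicAdd
    if (nodeIdx >= pool.nodes.size()) {
        return false;
    }
    const std::uint32_t prevHead = pixel.head.exchange(nodeIdx);  // imageAtomicExchange
    pool.nodes[nodeIdx] = Node{depth, prevHead};
    return true;
}

// Walks one pixel's list from the head until the end-of-list marker.
std::vector<std::uint32_t> collect(const NodePool& pool, const PixelList& pixel) {
    std::vector<std::uint32_t> order;
    for (std::uint32_t i = pixel.head.load(); i != kEndOfList; i = pool.nodes[i].next) {
        order.push_back(i);
    }
    return order;
}
```

In this model the first fragment written to a pixel gets the end-of-list marker back as its previous head, and later fragments get the index of the node inserted just before them. A previous head of 0 is therefore only expected when node 0 was the last insert at that pixel.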

Thanks.

EDIT: do NVIDIA drivers support atomic operations on 3D images?

This works with OpenGL, so it should work on Vulkan…

Hi @laurentduroisin,

I cannot help you with this issue, but I think moving your post to the Vulkan section does give it a better chance to be seen. Our Vulkan experts regularly check in here and will hopefully be able to assist.

Thanks!

OK, how can I move this post?

I proactively did that for you.

Hi! OK, thanks. Strange issue: it seems the values are not written to my 3D image or to my SSBO.

Or they are only updated after debugPrintfEXT prints them:

const std::string fragmentShader = R"(#version 460
                                                      #extension GL_EXT_nonuniform_qualifier : enable
                                                      #extension GL_EXT_debug_printf : enable
                                                      struct NodeType {
                                                          vec4 color;
                                                          float depth;
                                                          uint next;
                                                      };
                                                      layout(set = 0, binding = 0) buffer CounterSSBO {
                                                          uint count[6];
                                                          uint maxNodes;
                                                      };
                                                      layout(set = 0, binding = 1, r32ui) uniform coherent uimage3D headPointers;
                                                      layout(set = 0, binding = 2) buffer linkedLists {
                                                          NodeType nodes[];
                                                      };
                                                      layout(set = 0, binding = 3) uniform sampler2D textures[];
                                                      layout (location = 0) in vec4 frontColor;
                                                      layout (location = 1) in vec2 fTexCoords;
                                                      layout (location = 2) in flat uint texIndex;
                                                      layout (location = 3) in vec3 normal;
                                                      layout (location = 4) in flat int viewIndex;
                                                      layout(location = 0) out vec4 fcolor;
                                                      void main() {
                                                           uint nodeIdx = atomicAdd(count[viewIndex], 1);
                                                           vec4 texel = (texIndex != 0) ? frontColor * texture(textures[texIndex-1], fTexCoords.xy) : frontColor;
                                                           if (nodeIdx < maxNodes) {
                                                                uint prevHead = imageAtomicExchange(headPointers, ivec3(gl_FragCoord.xy, viewIndex), nodeIdx);
                                                                nodes[nodeIdx+viewIndex*maxNodes].color = texel;
                                                                nodes[nodeIdx+viewIndex*maxNodes].depth = gl_FragCoord.z;
                                                                nodes[nodeIdx+viewIndex*maxNodes].next = prevHead;
                                                                //debugPrintfEXT("prev head : %i, next : %i, node Idx : %i, view index : %i\n", prevHead, nodes[nodeIdx+viewIndex*maxNodes].next, nodeIdx, viewIndex);
                                                           }
                                                           fcolor = vec4(0, 0, 0, 0);
                                                      })";

nodes[nodeIdx+viewIndex*maxNodes].next displays the value 0, or the prevHead value is -1…

OK, I’ve found the issue and now it’s working. I had set the wrong maxNodes, and the buffer size of my linked list was wrong in the descriptor.
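
For anyone who hits the same mistake, here is a small sketch of how the descriptor range for the node buffer could be checked on the CPU side. It assumes the `NodeType` struct from the shader is laid out with std430 rules (the struct is padded to 32 bytes because of the vec4) and that the buffer holds `maxNodes` nodes for each of the 6 faces, as the `nodeIdx+viewIndex*maxNodes` indexing suggests. The helper names `nodeBufferBytes` and `nodeIndexInRange` are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// CPU mirror of the shader's NodeType, assuming std430 layout:
// the vec4 forces 16-byte alignment, so 24 bytes of data are padded to 32.
struct alignas(16) NodeType {
    float color[4];      // vec4 color
    float depth;         // float depth
    std::uint32_t next;  // uint next
};
static_assert(sizeof(NodeType) == 32, "array stride of NodeType should be 32 bytes");
static_assert(offsetof(NodeType, depth) == 16, "depth follows the vec4");
static_assert(offsetof(NodeType, next) == 20, "next follows depth");

constexpr std::uint64_t kFaceCount = 6;

// Bytes needed when every face owns its own block of maxNodesPerFace nodes.
constexpr std::uint64_t nodeBufferBytes(std::uint64_t maxNodesPerFace) {
    return kFaceCount * maxNodesPerFace * sizeof(NodeType);
}

// True if nodes[nodeIdx + viewIndex * maxNodesPerFace] lies entirely inside
// a descriptor range of boundRangeBytes bytes.
constexpr bool nodeIndexInRange(std::uint64_t nodeIdx, std::uint64_t viewIndex,
                                std::uint64_t maxNodesPerFace, std::uint64_t boundRangeBytes) {
    return (nodeIdx + viewIndex * maxNodesPerFace + 1) * sizeof(NodeType) <= boundRangeBytes;
}
```

With 1000 nodes per face this gives 192000 bytes. A range sized for a single face (32000 bytes) already fails the check for the first node of face 1.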

At some point you really need to show us what you are working on! Your engagement and perseverance in bug hunting is amazing and makes me really curious.

Thanks. I’m trying to port my engine to Vulkan; here is the working directory:

https://github.com/LaurentDuroisin7601/ODFAEG/blob/master/ODFAEG

Debugging is something I really like to do.

But when I run the demo, my environment map is wrong. There seems to be a bug here:

environmentMap.clear(sf::Color::Transparent);
                                    VkClearColorValue clearColor;
                                    clearColor.uint32[0] = 0xffffffff;
                                    std::vector<VkCommandBuffer> commandBuffers = environmentMap.getCommandBuffers();
                                    for (unsigned int i = 0; i < commandBuffers.size(); i++) {
                                        VkImageSubresourceRange subresRange = {};
                                        subresRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
                                        subresRange.levelCount = 1;
                                        subresRange.layerCount = 1;
                                        vkCmdClearColorImage(commandBuffers[i], headPtrTextureImage, VK_IMAGE_LAYOUT_GENERAL, &clearColor, 1, &subresRange);
                                        for (unsigned int j = 0; j < 6; j++) {
                                            vkCmdFillBuffer(commandBuffers[i], counterShaderStorageBuffers[i], j*sizeof(uint32_t), sizeof(uint32_t), 0);
                                        }
                                        VkMemoryBarrier memoryBarrier;
                                        memoryBarrier.sType = VK_STRUCTURE_TYPE_MEMORY_BARRIER;
                                        memoryBarrier.pNext = nullptr;
                                        memoryBarrier.srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;
                                        memoryBarrier.dstAccessMask = VK_ACCESS_SHADER_READ_BIT | VK_ACCESS_SHADER_WRITE_BIT;
                                        vkCmdPipelineBarrier(commandBuffers[i], VK_PIPELINE_STAGE_TRANSFER_BIT, VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT, 0, 1, &memoryBarrier, 0, nullptr, 0, nullptr);
                                    }
                                    environmentMap.display();

I get a lot of 0 values from my headPtrTexture in the shader. It seems that when the shader accesses the 3D image, not all values are set to 0xffffffff, even with the memory barrier, so the memory barrier doesn’t seem to work…

Maybe it’s because I do the clearing and the drawing in different submits? Should I use a VkEvent instead?

Edit: no, that doesn’t work. Microsoft Copilot says that if the layout of the image is VK_IMAGE_LAYOUT_GENERAL, it can lead to incoherent reads…

EDIT 2: I tried several things, like changing the image layout to VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL and setting the layerCount of the subresource range to 6, but that doesn’t solve the issue.

OK, I need to post minimal code to report the bug…, but the problem is that Vulkan needs a lot of setup code before it can draw anything…

Hi! I’ve written simpler code which reproduces the bug; it seems to be a driver bug.
When executing this code, we can see that debugPrintfEXT prints 0 for the image value, which it shouldn’t, because I clear the image with the value 0xffffffff (-1) and I don’t write to the image…
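
As a side note on reading the debug output: the clear value 0xffffffff shows up as -1 because the shaders print a uint with the signed `%i` specifier. The sketch below illustrates the two readings of the same 32 bits on the CPU. `asSigned` and `describe` are hypothetical helpers for this example, and it assumes the usual two's-complement representation.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Reinterprets the 32 bits of an unsigned value as a signed integer, which is
// what a signed format specifier effectively shows for a uint argument.
std::int32_t asSigned(std::uint32_t bits) {
    std::int32_t value;
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}

// Formats both readings of the same bit pattern side by side.
std::string describe(std::uint32_t bits) {
    return "signed=" + std::to_string(asSigned(bits)) +
           " unsigned=" + std::to_string(bits);
}
```

So a printed -1 means the texel still holds the clear value, while a printed 0 is a genuinely different value. An unsigned specifier such as `%u` in the print call would probably make the output easier to read.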

I tried to make the code as minimal as possible:

#include "application.h"
#include "odfaeg/Graphics/graphics.hpp"
#include "odfaeg/Audio/audio.hpp"
#include "odfaeg/Math/math.hpp"
#include "hero.hpp"


using namespace odfaeg::core;
using namespace odfaeg::math;
using namespace odfaeg::physic;
using namespace odfaeg::graphic;
using namespace odfaeg::window;
using namespace odfaeg::audio;
using namespace sorrok;
#include "odfaeg/Graphics/renderWindow.h"
#include "odfaeg/Graphics/font.h"
#include "odfaeg/Graphics/text.h"
#include "odfaeg/Graphics/sprite.h"
#include "odfaeg/Window/iEvent.hpp"
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>
#include <glm/gtc/type_ptr.hpp>
struct UniformBufferObject {
    Matrix4f model;
    Matrix4f view;
    Matrix4f proj;
};
void compileShaders(Shader& sLinkedList) {
    const std::string linkedListIndirectRenderingVertexShader = R"(#version 460
                                                               #extension GL_EXT_multiview : enable
                                                               #extension GL_EXT_debug_printf : enable
                                                               layout (location = 0) in vec3 position;
                                                               layout (location = 1) in vec4 color;
                                                               layout (location = 2) in vec2 texCoords;
                                                               layout (location = 3) in vec3 normals;
                                                               layout(binding = 0) uniform UniformBufferObject {
                                                                    mat4 model;
                                                                    mat4 view;
                                                                    mat4 proj;
                                                               } ubo;
                                                               layout (location = 0) out vec4 frontColor;
                                                               layout (location = 1) out vec2 fTexCoords;
                                                               layout (location = 2) out vec3 normal;
                                                               layout (location = 3) out flat int viewIndex;
                                                               void main() {
                                                                    gl_Position = (vec4(position, 1.f) * ubo.model * ubo.view * ubo.proj);
                                                                    gl_Position.z = 0;
                                                                    fTexCoords = texCoords;
                                                                    frontColor = color;
                                                                    normal = normals;
                                                                    viewIndex = gl_ViewIndex;
                                                               }
                                                               )";

    const std::string fragmentShader = R"(#version 460
                                          #extension GL_EXT_nonuniform_qualifier : enable
                                          #extension GL_EXT_debug_printf : enable
                                          layout(set = 0, binding = 1, r32ui) uniform coherent uimage3D headPointers;

                                          layout (location = 0) in vec4 frontColor;
                                          layout (location = 1) in vec2 fTexCoords;
                                          layout (location = 2) in vec3 normal;
                                          layout (location = 3) in flat int viewIndex;
                                          layout(location = 0) out vec4 fcolor;
                                          void main() {
                                               uint prevHead = imageLoad(headPointers, ivec3(gl_FragCoord.xy, viewIndex)).r;
                                               if (prevHead == 0)
                                                    debugPrintfEXT("prevHead: %i\n", prevHead);
                                               fcolor = frontColor;
                                          })";
    if (!sLinkedList.loadFromMemory(linkedListIndirectRenderingVertexShader, fragmentShader)) {
        throw Erreur(58, "Error, failed to load per pixel linked list shader", 3);
    }
}
void createDescriptorPoolLinkedList(Device& vkDevice, Shader& shader, RenderTarget& target) {
    std::vector<VkDescriptorPool>& descriptorPool = target.getDescriptorPool();
    descriptorPool.resize(Shader::getNbShaders()*RenderTarget::getNbRenderTargets());
    std::array<VkDescriptorPoolSize, 2> poolSizes;
    poolSizes[0].type = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
    poolSizes[0].descriptorCount = 1;
    poolSizes[1].type = VK_DESCRIPTOR_TYPE_STORAGE_IMAGE;
    poolSizes[1].descriptorCount = 1;
    VkDescriptorPoolCreateInfo poolInfo{};
    poolInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO;
    poolInfo.poolSizeCount = static_cast<uint32_t>(poolSizes.size());
    poolInfo.pPoolSizes = poolSizes.data();
    poolInfo.maxSets = 1;
    unsigned int descriptorId = target.getId() * Shader::getNbShaders() + shader.getId();
    if (vkCreateDescriptorPool(vkDevice.getDevice(), &poolInfo, nullptr, &descriptorPool[descriptorId]) != VK_SUCCESS) {
        throw std::runtime_error("failed to create descriptor pool!");
    }
}
void createDescriptorLayoutLinkedList(Device& vkDevice,Shader& shader, RenderTarget& target) {
    std::vector<VkDescriptorSetLayout>& descriptorSetLayout = target.getDescriptorSetLayout();
    descriptorSetLayout.resize(Shader::getNbShaders()*RenderTarget::getNbRenderTargets());
    VkDescriptorSetLayoutBinding uniformBufferLayoutBinding;
    uniformBufferLayoutBinding.binding = 0;
    uniformBufferLayoutBinding.descriptorCount = 1;
    uniformBufferLayoutBinding.descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
    uniformBufferLayoutBinding.pImmutableSamplers = nullptr;
    uniformBufferLayoutBinding.stageFlags = VK_SHADER_STAGE_VERTEX_BIT;

    VkDescriptorSetLayoutBinding headPtrImageLayoutBinding;
    headPtrImageLayoutBinding.binding = 1;
    headPtrImageLayoutBinding.descriptorCount = 1;
    headPtrImageLayoutBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_IMAGE;
    headPtrImageLayoutBinding.pImmutableSamplers = nullptr;
    headPtrImageLayoutBinding.stageFlags = VK_SHADER_STAGE_FRAGMENT_BIT;

    std::array<VkDescriptorSetLayoutBinding, 2> bindings = {headPtrImageLayoutBinding, uniformBufferLayoutBinding};
    VkDescriptorSetLayoutCreateInfo layoutInfo{};
    layoutInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
    //layoutInfo.flags = VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR;
    layoutInfo.bindingCount = static_cast<uint32_t>(bindings.size());
    layoutInfo.pBindings = bindings.data();
    unsigned int descriptorId = target.getId() * Shader::getNbShaders() + shader.getId();
    if (vkCreateDescriptorSetLayout(vkDevice.getDevice(), &layoutInfo, nullptr, &descriptorSetLayout[descriptorId]) != VK_SUCCESS) {
        throw std::runtime_error("failed to create descriptor set layout!");
    }
}
void allocateDescriptorSets(Device& vkDevice, Shader& shader, RenderTarget &target) {
    std::vector<std::vector<VkDescriptorSet>>& descriptorSets = target.getDescriptorSet();
    std::vector<VkDescriptorSetLayout>& descriptorSetLayout = target.getDescriptorSetLayout();
    std::vector<VkDescriptorPool>& descriptorPool = target.getDescriptorPool();
    descriptorSets.resize(Shader::getNbShaders()*RenderTarget::getNbRenderTargets());
    unsigned int descriptorId = target.getId() * Shader::getNbShaders() + shader.getId();
    for (unsigned int i = 0; i < descriptorSets.size(); i++) {
        descriptorSets[i].resize(1);
    }
    std::vector<VkDescriptorSetLayout> layouts(1, descriptorSetLayout[descriptorId]);
    VkDescriptorSetAllocateInfo allocInfo{};
    allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO;
    allocInfo.descriptorPool = descriptorPool[descriptorId];
    allocInfo.descriptorSetCount = 1;
    allocInfo.pSetLayouts = layouts.data();
    if (vkAllocateDescriptorSets(vkDevice.getDevice(), &allocInfo, descriptorSets[descriptorId].data()) != VK_SUCCESS) {
        throw std::runtime_error("failed to allocate descriptor set!");
    }
}
void createDescriptorSets(Device& vkDevice, Shader& shader, RenderTarget& target, VkImage& headPtrTextureImage, VkImageView& headPtrTextureImageView, VkSampler& headPtrTextureSampler, VkBuffer ubo) {
    unsigned int descriptorId = target.getId() * Shader::getNbShaders() + shader.getId();
    std::vector<std::vector<VkDescriptorSet>>& descriptorSets = target.getDescriptorSet();
    std::array<VkWriteDescriptorSet, 2> descriptorWrites{};

    VkDescriptorBufferInfo bufferInfo{};
    bufferInfo.buffer = ubo;
    bufferInfo.offset = 0;
    bufferInfo.range = sizeof(UniformBufferObject);

    descriptorWrites[0].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
    descriptorWrites[0].dstSet = descriptorSets[descriptorId][0];
    descriptorWrites[0].dstBinding = 0;
    descriptorWrites[0].dstArrayElement = 0;
    descriptorWrites[0].descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
    descriptorWrites[0].descriptorCount = 1;
    descriptorWrites[0].pBufferInfo = &bufferInfo;

    VkDescriptorImageInfo headPtrDescriptorImageInfo;
    headPtrDescriptorImageInfo.imageLayout = VK_IMAGE_LAYOUT_GENERAL;
    headPtrDescriptorImageInfo.imageView = headPtrTextureImageView;
    headPtrDescriptorImageInfo.sampler = headPtrTextureSampler;

    descriptorWrites[1].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
    descriptorWrites[1].dstSet = descriptorSets[descriptorId][0];
    descriptorWrites[1].dstBinding = 1;
    descriptorWrites[1].dstArrayElement = 0;
    descriptorWrites[1].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_IMAGE;
    descriptorWrites[1].descriptorCount = 1;
    descriptorWrites[1].pImageInfo = &headPtrDescriptorImageInfo;
    vkUpdateDescriptorSets(vkDevice.getDevice(), static_cast<uint32_t>(descriptorWrites.size()), descriptorWrites.data(), 0, nullptr);
}
uint32_t findMemoryType(Device& vkDevice, uint32_t typeFilter, VkMemoryPropertyFlags properties) {
    VkPhysicalDeviceMemoryProperties memProperties;
    vkGetPhysicalDeviceMemoryProperties(vkDevice.getPhysicalDevice(), &memProperties);
    for (uint32_t i = 0; i < memProperties.memoryTypeCount; i++) {
        if ((typeFilter & (1 << i)) && (memProperties.memoryTypes[i].propertyFlags & properties) == properties) {
            return i;
        }
    }
    throw std::runtime_error("no memory type satisfies the buffer requirements!");
}
void createBuffer(Device& vkDevice, VkDeviceSize size, VkBufferUsageFlags usage, VkMemoryPropertyFlags properties, VkBuffer& buffer, VkDeviceMemory& bufferMemory) {
    VkBufferCreateInfo bufferInfo{};
    bufferInfo.sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO;
    bufferInfo.size = size;
    bufferInfo.usage = usage;
    bufferInfo.sharingMode = VK_SHARING_MODE_EXCLUSIVE;

    if (vkCreateBuffer(vkDevice.getDevice(), &bufferInfo, nullptr, &buffer) != VK_SUCCESS) {
        throw std::runtime_error("failed to create buffer!");
    }

    VkMemoryRequirements memRequirements;
    vkGetBufferMemoryRequirements(vkDevice.getDevice(), buffer, &memRequirements);

    VkMemoryAllocateInfo allocInfo{};
    allocInfo.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
    allocInfo.allocationSize = memRequirements.size;
    allocInfo.memoryTypeIndex = findMemoryType(vkDevice, memRequirements.memoryTypeBits, properties);

    if (vkAllocateMemory(vkDevice.getDevice(), &allocInfo, nullptr, &bufferMemory) != VK_SUCCESS) {
        throw std::runtime_error("failed to allocate buffer memory!");
    }

    vkBindBufferMemory(vkDevice.getDevice(), buffer, bufferMemory, 0);
}
int main(int argc, char *argv[]) {
    VkSettup instance;
    Device device(instance);
    RenderWindow window(sf::VideoMode(800, 600), "test vulkan", device);
    window.getView().move(400, 300, 0);
    RenderTexture rtCubeMap(device);
    rtCubeMap.createCubeMap(800, 800);
    VertexBuffer vb(device);
    vb.setPrimitiveType(sf::Triangles);
    vb.append(Vertex(sf::Vector3f(0, 0, 0)));
    vb.append(Vertex(sf::Vector3f(800, 0, 0)));
    vb.append(Vertex(sf::Vector3f(800, 600, 0)));
    vb.append(Vertex(sf::Vector3f(0, 0, 0)));
    vb.append(Vertex(sf::Vector3f(800, 600, 0)));
    vb.append(Vertex(sf::Vector3f(0, 600, 0)));
    vb.update();


    VkImage headPtrTextureImage;
    VkImageCreateInfo imageInfo{};
    imageInfo.sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO;
    imageInfo.imageType = VK_IMAGE_TYPE_3D;
    imageInfo.extent.width = static_cast<uint32_t>(window.getView().getSize().x);
    imageInfo.extent.height = static_cast<uint32_t>(window.getView().getSize().y);
    imageInfo.extent.depth = 6;
    imageInfo.mipLevels = 1;
    imageInfo.arrayLayers = 1;
    imageInfo.format = VK_FORMAT_R32_UINT;
    imageInfo.tiling = VK_IMAGE_TILING_OPTIMAL;
    imageInfo.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;
    imageInfo.usage = VK_IMAGE_USAGE_TRANSFER_DST_BIT | VK_IMAGE_USAGE_SAMPLED_BIT | VK_IMAGE_USAGE_STORAGE_BIT;
    imageInfo.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
    imageInfo.samples = VK_SAMPLE_COUNT_1_BIT;
    imageInfo.flags = 0; // Optional
    Shader linkedListShader(device);
    compileShaders(linkedListShader);

    if (vkCreateImage(device.getDevice(), &imageInfo, nullptr, &headPtrTextureImage) != VK_SUCCESS) {
        throw std::runtime_error("failed to create image!");
    }

    VkMemoryRequirements memRequirements;
    vkGetImageMemoryRequirements(window.getDevice().getDevice(), headPtrTextureImage, &memRequirements);

    VkMemoryAllocateInfo allocInfo{};
    allocInfo.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
    allocInfo.allocationSize = memRequirements.size;
    allocInfo.memoryTypeIndex = findMemoryType(device, memRequirements.memoryTypeBits, VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT);
    VkDeviceMemory headPtrTextureImageMemory;

    if (vkAllocateMemory(device.getDevice(), &allocInfo, nullptr, &headPtrTextureImageMemory) != VK_SUCCESS) {
        throw std::runtime_error("failed to allocate image memory!");
    }
    vkBindImageMemory(device.getDevice(), headPtrTextureImage, headPtrTextureImageMemory, 0);
    VkImageView headPtrTextureImageView;
    VkImageViewCreateInfo viewInfo{};
    viewInfo.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO;
    viewInfo.image = headPtrTextureImage;
    viewInfo.viewType = VK_IMAGE_VIEW_TYPE_3D;
    viewInfo.format = VK_FORMAT_R32_UINT;
    viewInfo.subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
    viewInfo.subresourceRange.baseMipLevel = 0;
    viewInfo.subresourceRange.levelCount = 1;
    viewInfo.subresourceRange.baseArrayLayer = 0;
    viewInfo.subresourceRange.layerCount = 1;
    if (vkCreateImageView(device.getDevice(), &viewInfo, nullptr, &headPtrTextureImageView) != VK_SUCCESS) {
        throw std::runtime_error("failed to create head ptr texture image view!");
    }
    VkSampler headPtrTextureSampler;
    VkSamplerCreateInfo samplerInfo{};
    samplerInfo.sType = VK_STRUCTURE_TYPE_SAMPLER_CREATE_INFO;
    samplerInfo.magFilter = VK_FILTER_LINEAR;
    samplerInfo.minFilter = VK_FILTER_LINEAR;
    samplerInfo.addressModeU = VK_SAMPLER_ADDRESS_MODE_REPEAT;
    samplerInfo.addressModeV = VK_SAMPLER_ADDRESS_MODE_REPEAT;
    samplerInfo.addressModeW = VK_SAMPLER_ADDRESS_MODE_REPEAT;
    samplerInfo.anisotropyEnable = VK_TRUE;
    VkPhysicalDeviceProperties properties{};
    vkGetPhysicalDeviceProperties(device.getPhysicalDevice(), &properties);
    samplerInfo.maxAnisotropy = properties.limits.maxSamplerAnisotropy;
    samplerInfo.borderColor = VK_BORDER_COLOR_INT_OPAQUE_BLACK;
    samplerInfo.unnormalizedCoordinates = VK_FALSE;
    samplerInfo.compareEnable = VK_FALSE;
    samplerInfo.compareOp = VK_COMPARE_OP_ALWAYS;
    samplerInfo.mipmapMode = VK_SAMPLER_MIPMAP_MODE_LINEAR;
    samplerInfo.mipLodBias = 0.0f;
    samplerInfo.minLod = 0.0f;
    samplerInfo.maxLod = 0.0f;
    rtCubeMap.beginRecordCommandBuffers();
    VkImageMemoryBarrier barrier = {};
    barrier.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER;
    barrier.dstAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;
    barrier.oldLayout = VK_IMAGE_LAYOUT_UNDEFINED;
    barrier.newLayout = VK_IMAGE_LAYOUT_GENERAL;
    barrier.image = headPtrTextureImage;
    barrier.subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
    barrier.subresourceRange.levelCount = 1;
    barrier.subresourceRange.layerCount = 1;
    vkCmdPipelineBarrier(rtCubeMap.getCommandBuffers()[0], VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT, VK_PIPELINE_STAGE_TRANSFER_BIT, 0, 0, nullptr, 0, nullptr, 1, &barrier);
    rtCubeMap.display();
    VkBuffer ubo;
    VkDeviceMemory uboMemory;
    createBuffer(device, sizeof(UniformBufferObject) , VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT, ubo, uboMemory);
    UniformBufferObject uboData;
    uboData.model = Matrix4f();
    uboData.view = window.getView().getViewMatrix().getMatrix();
    uboData.proj = window.getView().getProjMatrix().getMatrix();
    void* data;
    vkMapMemory(device.getDevice(), uboMemory, 0, sizeof(UniformBufferObject), 0, &data);
    memcpy(data, &uboData, sizeof(UniformBufferObject));
    vkUnmapMemory(device.getDevice(), uboMemory);
    createDescriptorPoolLinkedList(device, linkedListShader, rtCubeMap);
    createDescriptorLayoutLinkedList(device, linkedListShader, rtCubeMap);
    allocateDescriptorSets(device, linkedListShader, rtCubeMap);
    createDescriptorSets(device, linkedListShader, rtCubeMap, headPtrTextureImage, headPtrTextureImageView, headPtrTextureSampler, ubo);
    if (vkCreateSampler(device.getDevice(), &samplerInfo, nullptr, &headPtrTextureSampler) != VK_SUCCESS) {
        throw std::runtime_error("failed to create texture sampler!");
    }
    RenderStates states;
    states.shader = &linkedListShader;
    rtCubeMap.createGraphicPipeline(sf::Triangles, states);
    while (window.isOpen()) {
        IEvent event;
        while (window.pollEvent(event)) {
            if (event.type == IEvent::WINDOW_EVENT && event.window.type == IEvent::WINDOW_EVENT_CLOSED) {
                window.close();
            }
        }
        rtCubeMap.clear(sf::Color::Transparent);
        VkClearColorValue clearColor;
        clearColor.uint32[0] = 0xffffffff;
        VkImageSubresourceRange subresRange = {};
        subresRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
        subresRange.levelCount = 1;
        subresRange.layerCount = 1;
        //transitionImageLayout(headPtrTextureImage, VK_FORMAT_R32_UINT, VK_IMAGE_LAYOUT_GENERAL, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL);
        vkCmdClearColorImage(rtCubeMap.getCommandBuffers()[0], headPtrTextureImage, VK_IMAGE_LAYOUT_GENERAL, &clearColor, 1, &subresRange);

        VkMemoryBarrier memoryBarrier;
        memoryBarrier.sType = VK_STRUCTURE_TYPE_MEMORY_BARRIER;
        memoryBarrier.pNext = nullptr;
        memoryBarrier.srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;
        memoryBarrier.dstAccessMask = VK_ACCESS_SHADER_READ_BIT | VK_ACCESS_SHADER_WRITE_BIT | VK_ACCESS_MEMORY_READ_BIT;
        vkCmdPipelineBarrier(rtCubeMap.getCommandBuffers()[0], VK_PIPELINE_STAGE_TRANSFER_BIT, VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT, 0, 1, &memoryBarrier, 0, nullptr, 0, nullptr);
        unsigned int descriptorId = rtCubeMap.getId() * Shader::getNbShaders() + linkedListShader.getId();
        rtCubeMap.drawVertexBuffer(rtCubeMap.getCommandBuffers()[0], 0, vb, 0, states);
        rtCubeMap.display();
        window.clear(sf::Color::Black);
        window.display();
    }
    vkDestroyBuffer(device.getDevice(), ubo, nullptr);

    return 0;
    /*MyAppli app(sf::VideoMode(800, 600), "Test odfaeg");
    return app.exec();*/
}

Thanks.

I tried an image memory barrier instead of a memory barrier, but I get the same issue.

Hi, I forgot to mention that my graphics card is an NVIDIA 1660 Super.

Hi! I tried adding the transfer wait stage to my semaphore:

void RenderTexture::display() {
            if (getCommandBuffers().size() > 0) {
                //std::cout<<"render texture end command buffer"<<std::endl;
                vkCmdEndRenderPass(getCommandBuffers()[getCurrentFrame()]);
                //for (unsigned int i = 0; i < getCommandBuffers().size(); i++) {
                    if (vkEndCommandBuffer(getCommandBuffers()[currentFrame]) != VK_SUCCESS) {
                        throw core::Erreur(0, "failed to record command buffer!", 1);
                    }
                //}
                vkWaitForFences(vkDevice.getDevice(), 1, &inFlightFences[currentFrame], VK_TRUE, UINT64_MAX);
                const uint64_t waitValue = value; // Wait until the semaphore counter is >= value
                const uint64_t signalValue = value+1;

                VkTimelineSemaphoreSubmitInfo timelineInfo;
                timelineInfo.sType = VK_STRUCTURE_TYPE_TIMELINE_SEMAPHORE_SUBMIT_INFO;
                timelineInfo.pNext = NULL;
                timelineInfo.waitSemaphoreValueCount = 1;
                timelineInfo.pWaitSemaphoreValues = &waitValue;
                timelineInfo.signalSemaphoreValueCount = 1;
                timelineInfo.pSignalSemaphoreValues = &signalValue;


                VkSubmitInfo submitInfo{};
                submitInfo.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO;



                VkSemaphore waitSemaphores[] = {renderFinishedSemaphores[currentFrame]};
                VkPipelineStageFlags waitStages[] = {VK_PIPELINE_STAGE_TRANSFER_BIT, VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT, VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT};
                submitInfo.pNext = &timelineInfo;
                submitInfo.waitSemaphoreCount = 1;
                submitInfo.pWaitSemaphores = waitSemaphores;
                submitInfo.pWaitDstStageMask = waitStages;
                submitInfo.commandBufferCount = 1;
                submitInfo.pCommandBuffers = &getCommandBuffers()[currentFrame];
                submitInfo.signalSemaphoreCount = 1;
                submitInfo.pSignalSemaphores = waitSemaphores;


                vkResetFences(vkDevice.getDevice(), 1, &inFlightFences[currentFrame]);
                if (vkQueueSubmit(vkDevice.getGraphicsQueue(), 1, &submitInfo, inFlightFences[currentFrame]) != VK_SUCCESS) {
                    throw core::Erreur(0, "failed to submit command buffer!", 1);
                }
                vkDeviceWaitIdle(vkDevice.getDevice());
                value++;
            }
        }

But it doesn’t work…

rtCubeMap.clear(sf::Color::Transparent);
        rtCubeMap.display();
        rtCubeMap.beginRecordCommandBuffers();
        VkClearColorValue clearColor;
        clearColor.uint32[0] = 0xffffffff;
        VkImageSubresourceRange subresRange = {};
        subresRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
        subresRange.levelCount = 1;
        subresRange.layerCount = 1;
        //transitionImageLayout(headPtrTextureImage, VK_FORMAT_R32_UINT, VK_IMAGE_LAYOUT_GENERAL, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL);
        vkCmdClearColorImage(rtCubeMap.getCommandBuffers()[0], headPtrTextureImage, VK_IMAGE_LAYOUT_GENERAL, &clearColor, 1, &subresRange);
        VkImageMemoryBarrier barrier2 = {};

        barrier2.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER;
        barrier2.srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;
        barrier2.dstAccessMask = VK_ACCESS_SHADER_READ_BIT | VK_ACCESS_SHADER_WRITE_BIT;
        barrier2.oldLayout = VK_IMAGE_LAYOUT_GENERAL;
        barrier2.newLayout = VK_IMAGE_LAYOUT_GENERAL;
        barrier2.image = headPtrTextureImage;
        barrier2.subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
        barrier2.subresourceRange.levelCount = 1;
        barrier2.subresourceRange.layerCount = 1;

        vkCmdPipelineBarrier(rtCubeMap.getCommandBuffers()[0], VK_PIPELINE_STAGE_TRANSFER_BIT, VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT, 0, 0, nullptr, 0, nullptr, 1, &barrier2);
        rtCubeMap.beginRenderPass();
        rtCubeMap.display();
        unsigned int descriptorId = rtCubeMap.getId() * Shader::getNbShaders() + linkedListShader.getId();
        rtCubeMap.beginRecordCommandBuffers();
        rtCubeMap.beginRenderPass();
        rtCubeMap.drawVertexBuffer(rtCubeMap.getCommandBuffers()[0], 0, vb, 0, states);
        rtCubeMap.display();
        window.clear(sf::Color::Black);
        window.display();

I don’t really understand how timeline semaphores work. Is the value incremented each time we wait, or each time the semaphore is signaled?

And is the semaphore signaled when the values in the semaphore info are reached?

I need a timeline semaphore because I need the semaphore to be signaled by the host and this is not possible with binary semaphores.
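
To make the counter semantics asked about here concrete: a timeline semaphore holds a 64-bit value that only ever increases. A signal operation sets it to the value given in the submit info, which must be greater than the current one. A wait is satisfied once the counter is greater than or equal to the wait value, and waiting never changes the counter. The sketch below is a plain C++ model of that rule, not Vulkan code; `TimelineModel`, `signal`, `waitSatisfied` and `runExample` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical CPU-side model of a timeline semaphore's counter.
// It only mirrors the value rules; it performs no GPU synchronization.
struct TimelineModel {
    std::uint64_t value = 0; // plays the role of the semaphore's initial value

    // Signal: set the counter to v. The new value must be strictly greater
    // than the current one; otherwise this model rejects the signal.
    bool signal(std::uint64_t v) {
        if (v <= value) {
            return false;
        }
        value = v;
        return true;
    }

    // Wait: satisfied once the counter has reached v. Never modifies the counter.
    bool waitSatisfied(std::uint64_t v) const {
        return value >= v;
    }
};

// Walks through a short signal sequence and returns the final counter value.
inline std::uint64_t runExample() {
    TimelineModel sem;
    const bool first = sem.signal(1);  // first submission signals 1
    const bool repeat = sem.signal(1); // rejected: 1 is not an increase
    const bool next = sem.signal(2);   // a later submission signals a larger value
    assert(first && !repeat && next);
    return sem.value;
}
```

One consequence of this rule is that a per-frame signal needs a value that grows from frame to frame; reusing a constant value is no longer a valid signal once the counter has reached it.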

OK, I tried this:

void RenderTexture::display(bool isSignalSemaphore, VkPipelineStageFlags2 stageMask) {
            if (getCommandBuffers().size() > 0) {
                //std::cout<<"render texture end command buffer"<<std::endl;
                vkCmdEndRenderPass(getCommandBuffers()[getCurrentFrame()]);
                //for (unsigned int i = 0; i < getCommandBuffers().size(); i++) {
                    if (vkEndCommandBuffer(getCommandBuffers()[currentFrame]) != VK_SUCCESS) {
                        throw core::Erreur(0, "failed to record command buffer!", 1);
                    }
                //}
                vkWaitForFences(vkDevice.getDevice(), 1, &inFlightFences[currentFrame], VK_TRUE, UINT64_MAX);

                VkSemaphoreSubmitInfo timelineInfo = {};
                VkSubmitInfo2 submitInfo = {};
                if (isSignalSemaphore) {
                    const uint64_t signalValue = 1;
                    timelineInfo.sType = VK_STRUCTURE_TYPE_SEMAPHORE_SUBMIT_INFO;
                    timelineInfo.semaphore = renderFinishedSemaphores[currentFrame];
                    timelineInfo.value = signalValue;
                    timelineInfo.stageMask = stageMask;
                    submitInfo.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO_2;
                    submitInfo.signalSemaphoreInfoCount = 1;
                    submitInfo.pSignalSemaphoreInfos = &timelineInfo;
                } else {
                    const uint64_t waitValue = 1;
                    timelineInfo.sType = VK_STRUCTURE_TYPE_SEMAPHORE_SUBMIT_INFO;
                    timelineInfo.semaphore = renderFinishedSemaphores[currentFrame];
                    timelineInfo.value = waitValue;
                    timelineInfo.stageMask = stageMask;
                    submitInfo.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO_2;
                    submitInfo.waitSemaphoreInfoCount = 1;
                    submitInfo.pWaitSemaphoreInfos = &timelineInfo;
                }
                submitInfo.commandBufferInfoCount = 1;
                VkCommandBufferSubmitInfo commandSubmitInfo = {};
                commandSubmitInfo.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_SUBMIT_INFO;
                commandSubmitInfo.commandBuffer = getCommandBuffers()[currentFrame];
                submitInfo.pCommandBufferInfos = &commandSubmitInfo;
                vkResetFences(vkDevice.getDevice(), 1, &inFlightFences[currentFrame]);
                if (vkQueueSubmit2(vkDevice.getGraphicsQueue(), 1, &submitInfo, inFlightFences[currentFrame]) != VK_SUCCESS) {
                    throw core::Erreur(0, "failed to submit command buffer!", 1);
                }
                vkDeviceWaitIdle(vkDevice.getDevice());
            }
        }
rtCubeMap.clear(sf::Color::Transparent);
        rtCubeMap.display();
        rtCubeMap.beginRecordCommandBuffers();
        VkClearColorValue clearColor;
        clearColor.uint32[0] = 0xffffffff;
        VkImageSubresourceRange subresRange = {};
        subresRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
        subresRange.levelCount = 1;
        subresRange.layerCount = 1;
        //transitionImageLayout(headPtrTextureImage, VK_FORMAT_R32_UINT, VK_IMAGE_LAYOUT_GENERAL, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL);
        vkCmdClearColorImage(rtCubeMap.getCommandBuffers()[0], headPtrTextureImage, VK_IMAGE_LAYOUT_GENERAL, &clearColor, 1, &subresRange);
        VkImageMemoryBarrier barrier2 = {};

        barrier2.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER;
        barrier2.srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;
        barrier2.dstAccessMask = VK_ACCESS_SHADER_READ_BIT | VK_ACCESS_SHADER_WRITE_BIT;
        barrier2.oldLayout = VK_IMAGE_LAYOUT_GENERAL;
        barrier2.newLayout = VK_IMAGE_LAYOUT_GENERAL;
        barrier2.image = headPtrTextureImage;
        barrier2.subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
        barrier2.subresourceRange.levelCount = 1;
        barrier2.subresourceRange.layerCount = 1;

        vkCmdPipelineBarrier(rtCubeMap.getCommandBuffers()[0], VK_PIPELINE_STAGE_TRANSFER_BIT, VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT, 0, 0, nullptr, 0, nullptr, 1, &barrier2);
        rtCubeMap.beginRenderPass();
        rtCubeMap.display(true, VK_PIPELINE_STAGE_2_CLEAR_BIT);
        unsigned int descriptorId = rtCubeMap.getId() * Shader::getNbShaders() + linkedListShader.getId();
        rtCubeMap.beginRecordCommandBuffers();
        rtCubeMap.beginRenderPass();
        rtCubeMap.drawVertexBuffer(rtCubeMap.getCommandBuffers()[0], 0, vb, 0, states);
        rtCubeMap.display(false, VK_PIPELINE_STAGE_2_FRAGMENT_SHADER_BIT);

But the shader is clearly not waiting until the clear has finished…

OK, there is something strange: some of my textures are not updated when I write to them with one fragment shader and then read them with another. I tried to use synchronization, but it doesn’t solve the issue. I also checked my descriptors, image creation and everything else, and they seem to be correct.

The strange thing is that it works for some textures and not for others; the behaviour is quite random…

I tried switching to Linux with the NVIDIA proprietary driver, but I get the same issue…

OK, as for the alpha texture that was not being read: I had given it wrong texture coordinates. I also had to use 0 instead of -1 as the clear value of my head-pointer texture because of this bug, but I don’t think my head-pointer texture is cleared properly for the next frame…

OK, I finally solved it: I recorded vkCmdClearColorImage using one command pool, and read the value in the shader using another command pool and command buffer. The bug occurred when I used the same command buffer both to clear the image and to read it from the shader. It seems the driver doesn’t synchronize the commands, even with a memory barrier and a semaphore, but using two different command buffers forces the driver to synchronize:

The bug probably came from the GPU “pipelining” the clear and the read too aggressively, without seeing a real dependency between the two within a single command buffer. Splitting the work into two submissions forces that dependency to be visible.

Oops, that was not the bug. The bug occurs when I use a render pass with multiview to read the image values: some image values are not cleared. When I don’t use a render pass with multiview, it works…

Hi @laurentduroisin,

You have pasted a lot of code snippets in here, so it is a bit difficult to follow.

The first step is to make sure your application runs cleanly with the Vulkan Validation Layers enabled.

If that doesn’t help, I encourage you to use Nsight Graphics for debugging. This should give you a better understanding of the problem.

If you need help fixing an application bug, you may find better community support in more crowded places such as the Vulkan Discord server.

However, if you believe you have found a driver bug, you will have better luck getting help with it here by providing code for a minimal standalone application showcasing the issue.

I hope that helps,

Lionel