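VulkanSceneGraph: Displaying Oblique Photography Models (6.2) - Record and Submit
Preface
The previous chapter walked through the code for two key stages of the frame loop: event handling and update. This chapter continues with the next stage of the frame loop: record and submit.
Contents
- 1 Record and Submit
1 Record and Submit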
vsg_viewer->recordAndSubmit()
vsg::Viewer provides the recordAndSubmit method, which records and submits the commands for a frame.
void Viewer::recordAndSubmit()
{
    CPU_INSTRUMENTATION_L1_NC(instrumentation, "Viewer recordAndSubmitTask", COLOR_VIEWER);

    // reset connected ExecuteCommands
    for (const auto& recordAndSubmitTask : recordAndSubmitTasks)
    {
        for (auto& commandGraph : recordAndSubmitTask->commandGraphs)
        {
            commandGraph->reset();
        }
    }

#if 1
    if (_threading)
#else
    // The following is a workaround for an odd "Possible data race during write of size 1" warning that valgrind tool=helgrind reports
    // on the first call to vkBeginCommandBuffer despite them being done on independent command buffers. This could well be a driver bug or a false positive.
    // If you want to quieten this warning then change the #if above to #if 0 as rendering the first three frames single threaded avoids the warning.
    if (_threading && _frameStamp->frameCount > 2)
#endif
    {
        _frameBlock->set(_frameStamp);
        _submissionCompleted->arrive_and_wait();
    }
    else
    {
        for (auto& recordAndSubmitTask : recordAndSubmitTasks)
        {
            recordAndSubmitTask->submit(_frameStamp);
        }
    }
}
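The code above is the implementation of vsg::Viewer::recordAndSubmit. It resets every CommandGraph and then submits the tasks. Submission takes one of two paths: when the _threading flag is true, the tasks are submitted from multiple threads; otherwise they are submitted one by one on the main thread. The details of multithreaded submission will be analyzed in a later chapter.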
VkResult RecordAndSubmitTask::submit(ref_ptr<FrameStamp> frameStamp)
{
    CPU_INSTRUMENTATION_L1_NC(instrumentation, "RecordAndSubmitTask submit", COLOR_RECORD);

    //info("\nRecordAndSubmitTask::submit()");

    if (VkResult result = start(); result != VK_SUCCESS) return result;

    if (transferTask)
    {
        if (auto transfer = transferTask->transferData(TransferTask::TRANSFER_BEFORE_RECORD_TRAVERSAL); transfer.result == VK_SUCCESS)
        {
            if (transfer.dataTransferredSemaphore)
            {
                //info(" adding early transfer dataTransferredSemaphore ", transfer.dataTransferredSemaphore);
                earlyDataTransferredSemaphore = transfer.dataTransferredSemaphore;
            }
        }
        else
        {
            return transfer.result;
        }
    }

    auto recordedCommandBuffers = RecordedCommandBuffers::create();

    if (VkResult result = record(recordedCommandBuffers, frameStamp); result != VK_SUCCESS) return result;

    return finish(recordedCommandBuffers);
}
The code above is the implementation of task submission. It consists of four parts: start, data transfer before the record traversal, record, and finish.
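The four parts run strictly in order, and any failure ends the submission immediately. The following is a minimal sketch of that control flow only, not VSG code: `Result`, `Stage` and `runStages` are hypothetical names introduced for illustration, and `Result` merely stands in for VkResult.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for VkResult; two values are enough for the sketch.
enum class Result { Success, Error };

// A named stage; run() returns Result::Success to let the sequence continue.
struct Stage
{
    std::string name;
    std::function<Result()> run;
};

// Runs the stages in order and stops at the first failure, mirroring the
// "if (result != VK_SUCCESS) return result;" pattern used in submit().
// `executed` collects the names of the stages that were actually entered.
Result runStages(const std::vector<Stage>& stages, std::vector<std::string>& executed)
{
    for (const auto& stage : stages)
    {
        executed.push_back(stage.name);
        Result result = stage.run();
        if (result != Result::Success) return result;
    }
    return Result::Success;
}
```

If, for example, the second stage reports an error, the record and finish stages are never entered and the error is returned to the caller unchanged.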
1.1 Start
VkResult RecordAndSubmitTask::start()
{
    CPU_INSTRUMENTATION_L1_NC(instrumentation, "RecordAndSubmitTask start", COLOR_RECORD);

    earlyDataTransferredSemaphore.reset();
    lateDataTransferredSemaphore.reset();

    auto current_fence = fence();
    if (current_fence->hasDependencies())
    {
        //info("RecordAndSubmitTask::start() waiting on fence ", current_fence, ", ", current_fence->status(), ", current_fence->hasDependencies() = ", current_fence->hasDependencies());

        uint64_t timeout = std::numeric_limits<uint64_t>::max();
        if (VkResult result = current_fence->wait(timeout); result != VK_SUCCESS) return result;

        current_fence->resetFenceAndDependencies();

        //info("after RecordAndSubmitTask::start() waited on fence ", current_fence, ", ", current_fence->status(), ", current_fence->hasDependencies() = ", current_fence->hasDependencies());
    }
    else
    {
        //info("RecordAndSubmitTask::start() initial fence ", current_fence, ", ", current_fence->status(), ", current_fence->hasDependencies() = ", current_fence->hasDependencies());
    }

    return VK_SUCCESS;
}
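First, the two semaphores earlyDataTransferredSemaphore and lateDataTransferredSemaphore are reset; they indicate whether the data transfer before the record traversal and the data transfer after the record traversal, respectively, have completed. Next comes fence synchronization. The fence for the current frame is fetched; fences are selected in round-robin fashion (see part 6 of this series, on the frame loop). If that fence has dependencies, start() waits on it, that is, it waits until the submission that last used this fence has completed, so that the dependent semaphores have been signaled and the dependent CommandBuffers have finished executing. It then resets the fence together with its dependencies. The hasDependencies function is implemented as follows: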
bool hasDependencies() const { return (_dependentSemaphores.size() + _dependentCommandBuffers.size()) > 0; }
In other words, it checks whether the number of dependent semaphores plus the number of dependent CommandBuffers is greater than 0.
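The interplay between round-robin fence selection and hasDependencies can be sketched with plain counters. This is a simplified model under stated assumptions, not the VSG implementation: `MockFence`, `FenceRing`, `startFrame` and `submitFrame` are hypothetical names, and wait() only counts calls where a real fence would block until the GPU signals it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for vsg::Fence: it only tracks how many objects
// depend on the submission that last used it.
struct MockFence
{
    std::size_t dependentSemaphores = 0;
    std::size_t dependentCommandBuffers = 0;
    int waits = 0; // number of times wait() was called

    bool hasDependencies() const { return (dependentSemaphores + dependentCommandBuffers) > 0; }
    void wait() { ++waits; } // a real fence would block here
    void resetFenceAndDependencies()
    {
        dependentSemaphores = 0;
        dependentCommandBuffers = 0;
    }
};

// Hypothetical ring of fences, advanced once per frame (round-robin).
struct FenceRing
{
    std::vector<MockFence> fences;
    std::size_t index = 0;

    explicit FenceRing(std::size_t count) : fences(count) {}

    MockFence& current() { return fences[index]; }
    void advance() { index = (index + 1) % fences.size(); }
};

// Mirrors the shape of start(): wait only if the fence was used before.
// Returns true when a wait was needed.
bool startFrame(FenceRing& ring)
{
    MockFence& fence = ring.current();
    if (!fence.hasDependencies()) return false; // first use of this fence
    fence.wait();
    fence.resetFenceAndDependencies();
    return true;
}

// Mirrors the bookkeeping done at submission: the fence remembers what
// the submission depends on.
void submitFrame(FenceRing& ring, std::size_t commandBuffers, std::size_t semaphores)
{
    MockFence& fence = ring.current();
    fence.dependentCommandBuffers += commandBuffers;
    fence.dependentSemaphores += semaphores;
}
```

With a ring of two fences, the first two frames find a fence without dependencies and skip the wait; the third frame reuses the first fence and therefore has to wait for the first frame's submission.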
1.2 Data transfer before the record traversal
if (transferTask)
{
    if (auto transfer = transferTask->transferData(TransferTask::TRANSFER_BEFORE_RECORD_TRAVERSAL); transfer.result == VK_SUCCESS)
    {
        if (transfer.dataTransferredSemaphore)
        {
            //info(" adding early transfer dataTransferredSemaphore ", transfer.dataTransferredSemaphore);
            earlyDataTransferredSemaphore = transfer.dataTransferredSemaphore;
        }
    }
    else
    {
        return transfer.result;
    }
}
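VSG wraps data transfer in vsg::TransferTask. Data is compiled during viewer compilation (the vsg::Viewer::compile function) and after dynamic loading (handled by vsg::DatabasePager). When the vsg::TransferTask is non-null, the information about the data to be transferred is recorded in that object (see part 5.10 of this series, on vsg::TransferTask). The transfer itself is then carried out inside the frame loop, which ensures that each frame draws the correct content.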
1.3 record
VkResult RecordAndSubmitTask::record(ref_ptr<RecordedCommandBuffers> recordedCommandBuffers, ref_ptr<FrameStamp> frameStamp)
{
    CPU_INSTRUMENTATION_L1_NC(instrumentation, "RecordAndSubmitTask record", COLOR_RECORD);

    for (auto& commandGraph : commandGraphs)
    {
        commandGraph->record(recordedCommandBuffers, frameStamp, databasePager);
    }

    return VK_SUCCESS;
}
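The code above is the implementation of the record part: each commandGraph executes its record method in turn.
As shown in the figure above, parts 5.5 to 5.7 of this series analyzed the vsg::CommandGraph, vsg::RenderGraph and vsg::StateGroup nodes of the VSG scene graph. vsg::RecordTraversal is responsible for traversing the scene graph. vsg::CommandGraph wraps the traversal of its subgraph between vkBeginCommandBuffer and vkEndCommandBuffer, which is how commands get recorded. vsg::RenderGraph wraps the traversal of its subgraph between vkCmdBeginRenderPass and vkCmdEndRenderPass, so that the Vulkan commands are recorded within the scope of a VkRenderPass. vsg::StateGroup binds its StateCommands and thereby applies them to its subgraph. Among these, vsg::GraphicsPipeline is VSG's wrapper around the graphics pipeline; binding a vsg::GraphicsPipeline is what allows the draw commands for the primitives to be recorded.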
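The nesting described above can be illustrated with a toy scene graph that writes command names to a log instead of calling Vulkan. This is a sketch under simplifying assumptions: all `Mock*` types and `recordExampleScene` are hypothetical, the Vulkan function names are only logged as strings, and the state group here emits its bind command immediately, which is a simplification of how VSG tracks state.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

using CommandLog = std::vector<std::string>;

// Base node: by default it just forwards the traversal to its children.
struct Node
{
    std::vector<std::shared_ptr<Node>> children;
    virtual ~Node() = default;
    virtual void record(CommandLog& log) const
    {
        for (const auto& child : children) child->record(log);
    }
};

// Wraps the subgraph traversal in begin/end command buffer.
struct MockCommandGraph : Node
{
    void record(CommandLog& log) const override
    {
        log.push_back("vkBeginCommandBuffer");
        Node::record(log);
        log.push_back("vkEndCommandBuffer");
    }
};

// Wraps the subgraph traversal in begin/end render pass.
struct MockRenderGraph : Node
{
    void record(CommandLog& log) const override
    {
        log.push_back("vkCmdBeginRenderPass");
        Node::record(log);
        log.push_back("vkCmdEndRenderPass");
    }
};

// Applies state (here: one pipeline bind) to everything below it.
struct MockStateGroup : Node
{
    std::string pipeline;
    explicit MockStateGroup(std::string name) : pipeline(std::move(name)) {}
    void record(CommandLog& log) const override
    {
        log.push_back("vkCmdBindPipeline " + pipeline);
        Node::record(log);
    }
};

// Leaf node that issues a draw call.
struct MockDraw : Node
{
    void record(CommandLog& log) const override { log.push_back("vkCmdDraw"); }
};

// Builds CommandGraph -> RenderGraph -> StateGroup -> Draw and records it.
CommandLog recordExampleScene()
{
    auto stateGroup = std::make_shared<MockStateGroup>("graphicsPipeline");
    stateGroup->children.push_back(std::make_shared<MockDraw>());

    auto renderGraph = std::make_shared<MockRenderGraph>();
    renderGraph->children.push_back(stateGroup);

    MockCommandGraph commandGraph;
    commandGraph.children.push_back(renderGraph);

    CommandLog log;
    commandGraph.record(log);
    return log;
}
```

The resulting log lists the begin commands from the outermost node inwards, then the bind and the draw, then the end commands in reverse order, which is the bracketing the traversal produces.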
1.4 Finish
VkResult RecordAndSubmitTask::finish(ref_ptr<RecordedCommandBuffers> recordedCommandBuffers)
{
    CPU_INSTRUMENTATION_L1_NC(instrumentation, "RecordAndSubmitTask finish", COLOR_RECORD);

    //info("RecordAndSubmitTask::finish()");

    auto current_fence = fence();

    if (transferTask)
    {
        auto transfer = transferTask->transferData(TransferTask::TRANSFER_AFTER_RECORD_TRAVERSAL);
        if (transfer.result == VK_SUCCESS)
        {
            if (transfer.dataTransferredSemaphore)
            {
                //info(" adding late transfer dataTransferredSemaphore ", transfer.dataTransferredSemaphore);
                lateDataTransferredSemaphore = transfer.dataTransferredSemaphore;
            }
        }
        else
        {
            return transfer.result;
        }
    }

    if (recordedCommandBuffers->empty())
    {
        if (earlyDataTransferredSemaphore) transferTask->assignTransferConsumedCompletedSemaphore(TransferTask::TRANSFER_BEFORE_RECORD_TRAVERSAL, earlyDataTransferredSemaphore);
        if (lateDataTransferredSemaphore) transferTask->assignTransferConsumedCompletedSemaphore(TransferTask::TRANSFER_AFTER_RECORD_TRAVERSAL, lateDataTransferredSemaphore);

        // nothing to do so return early
        std::this_thread::sleep_for(std::chrono::milliseconds(16)); // sleep for 1/60th of a second
        return VK_SUCCESS;
    }

    // convert VSG CommandBuffer to Vulkan handles and add to the Fence's list of dependent CommandBuffers
    std::vector<VkCommandBuffer> vk_commandBuffers;
    std::vector<VkSemaphore> vk_waitSemaphores;
    std::vector<VkPipelineStageFlags> vk_waitStages;
    std::vector<VkSemaphore> vk_signalSemaphores;

    // convert VSG CommandBuffer to Vulkan handles and add to the Fence's list of dependent CommandBuffers
    auto buffers = recordedCommandBuffers->buffers();
    for (auto& commandBuffer : buffers)
    {
        if (commandBuffer->level() == VK_COMMAND_BUFFER_LEVEL_PRIMARY) vk_commandBuffers.push_back(*commandBuffer);

        current_fence->dependentCommandBuffers().emplace_back(commandBuffer);
    }

    if (earlyDataTransferredSemaphore)
    {
        vk_waitSemaphores.emplace_back(*earlyDataTransferredSemaphore);
        vk_waitStages.emplace_back(earlyDataTransferredSemaphore->pipelineStageFlags());
    }

    if (lateDataTransferredSemaphore)
    {
        vk_waitSemaphores.emplace_back(*lateDataTransferredSemaphore);
        vk_waitStages.emplace_back(lateDataTransferredSemaphore->pipelineStageFlags());
    }

    if (earlyDataTransferredSemaphore) transferTask->assignTransferConsumedCompletedSemaphore(TransferTask::TRANSFER_BEFORE_RECORD_TRAVERSAL, earlyTransferConsumerCompletedSemaphore);
    if (lateDataTransferredSemaphore) transferTask->assignTransferConsumedCompletedSemaphore(TransferTask::TRANSFER_AFTER_RECORD_TRAVERSAL, lateTransferConsumerCompletedSemaphore);

    for (auto& window : windows)
    {
        auto imageIndex = window->imageIndex();
        if (imageIndex >= window->numFrames()) continue;

        auto& semaphore = window->frame(imageIndex).imageAvailableSemaphore;

        vk_waitSemaphores.emplace_back(*semaphore);
        vk_waitStages.emplace_back(semaphore->pipelineStageFlags());
    }

    for (auto& semaphore : waitSemaphores)
    {
        vk_waitSemaphores.emplace_back(*(semaphore));
        vk_waitStages.emplace_back(semaphore->pipelineStageFlags());
    }

    current_fence->dependentSemaphores() = signalSemaphores;
    for (auto& semaphore : signalSemaphores)
    {
        vk_signalSemaphores.emplace_back(*(semaphore));
    }

    if (earlyDataTransferredSemaphore)
    {
        vk_signalSemaphores.emplace_back(earlyTransferConsumerCompletedSemaphore->vk());
        current_fence->dependentSemaphores().push_back(earlyTransferConsumerCompletedSemaphore);
    }

    if (lateDataTransferredSemaphore)
    {
        vk_signalSemaphores.emplace_back(lateTransferConsumerCompletedSemaphore->vk());
        current_fence->dependentSemaphores().push_back(lateTransferConsumerCompletedSemaphore);
    }

    VkSubmitInfo submitInfo = {};
    submitInfo.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO;

    submitInfo.waitSemaphoreCount = static_cast<uint32_t>(vk_waitSemaphores.size());
    submitInfo.pWaitSemaphores = vk_waitSemaphores.data();
    submitInfo.pWaitDstStageMask = vk_waitStages.data();

    submitInfo.commandBufferCount = static_cast<uint32_t>(vk_commandBuffers.size());
    submitInfo.pCommandBuffers = vk_commandBuffers.data();

    submitInfo.signalSemaphoreCount = static_cast<uint32_t>(vk_signalSemaphores.size());
    submitInfo.pSignalSemaphores = vk_signalSemaphores.data();

    return queue->submit(submitInfo, current_fence);
}
The code above consists mainly of two steps: the data transfer after the record traversal, and the queue submission.
if (transferTask)
{
    auto transfer = transferTask->transferData(TransferTask::TRANSFER_AFTER_RECORD_TRAVERSAL);
    if (transfer.result == VK_SUCCESS)
    {
        if (transfer.dataTransferredSemaphore)
        {
            //info(" adding late transfer dataTransferredSemaphore ", transfer.dataTransferredSemaphore);
            lateDataTransferredSemaphore = transfer.dataTransferredSemaphore;
        }
    }
    else
    {
        return transfer.result;
    }
}
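The code above is the implementation of the data transfer after the record traversal. It runs once vsg::RecordTraversal has traversed the scene graph and synchronizes data between the CPU and the GPU (data transfer).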
VkResult Queue::submit(const VkSubmitInfo& submitInfo, Fence* fence)
{
    std::scoped_lock<std::mutex> guard(_mutex);
    return vkQueueSubmit(_vkQueue, 1, &submitInfo, fence ? fence->vk() : VK_NULL_HANDLE);
}
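The queue submission itself is performed by calling vsg::Queue::submit, but a series of synchronization steps must be prepared before that call. Synchronization relies on vsg::Fence and vsg::Semaphore, which wrap VkFence and VkSemaphore respectively.
The figure above summarizes what record and submit do within the frame loop. In the submit stage, the semaphores involved are gathered into a VkSubmitInfo, which is built as follows: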
VkSubmitInfo submitInfo = {};
submitInfo.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO;

submitInfo.waitSemaphoreCount = static_cast<uint32_t>(vk_waitSemaphores.size());
submitInfo.pWaitSemaphores = vk_waitSemaphores.data(); // wait semaphores
submitInfo.pWaitDstStageMask = vk_waitStages.data();

submitInfo.commandBufferCount = static_cast<uint32_t>(vk_commandBuffers.size());
submitInfo.pCommandBuffers = vk_commandBuffers.data();

submitInfo.signalSemaphoreCount = static_cast<uint32_t>(vk_signalSemaphores.size());
submitInfo.pSignalSemaphores = vk_signalSemaphores.data(); // signal semaphores
It carries two kinds of semaphores: wait semaphores and signal semaphores.
if (earlyDataTransferredSemaphore)
{
    vk_waitSemaphores.emplace_back(*earlyDataTransferredSemaphore); // data transfer before the record traversal has completed
    vk_waitStages.emplace_back(earlyDataTransferredSemaphore->pipelineStageFlags());
}

if (lateDataTransferredSemaphore)
{
    vk_waitSemaphores.emplace_back(*lateDataTransferredSemaphore); // data transfer after the record traversal has completed
    vk_waitStages.emplace_back(lateDataTransferredSemaphore->pipelineStageFlags());
}

if (earlyDataTransferredSemaphore) transferTask->assignTransferConsumedCompletedSemaphore(TransferTask::TRANSFER_BEFORE_RECORD_TRAVERSAL, earlyTransferConsumerCompletedSemaphore);
if (lateDataTransferredSemaphore) transferTask->assignTransferConsumedCompletedSemaphore(TransferTask::TRANSFER_AFTER_RECORD_TRAVERSAL, lateTransferConsumerCompletedSemaphore);

for (auto& window : windows)
{
    auto imageIndex = window->imageIndex();
    if (imageIndex >= window->numFrames()) continue;

    auto& semaphore = window->frame(imageIndex).imageAvailableSemaphore;

    vk_waitSemaphores.emplace_back(*semaphore); // image for the frame acquired when advancing to the next frame is available
    vk_waitStages.emplace_back(semaphore->pipelineStageFlags());
}

for (auto& semaphore : waitSemaphores)
{
    vk_waitSemaphores.emplace_back(*(semaphore)); // wait semaphores set by the application
    vk_waitStages.emplace_back(semaphore->pipelineStageFlags());
}
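As the code above shows, the wait semaphores consist of: the semaphore for completion of the data transfer before the record traversal, the semaphore for completion of the data transfer after the record traversal, the image-available semaphore of the frame acquired when advancing to the next frame, and the wait semaphores set by the application.
The signal semaphores, in turn, notify the next stage that this queue submission has completed.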
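The way these lists are assembled can be sketched without Vulkan. The point to note is that every wait semaphore is pushed together with its stage mask, so the two arrays that end up in pWaitSemaphores and pWaitDstStageMask always have the same length. Everything in this sketch is hypothetical: `MockSemaphore`, `SubmitLists` and `buildSubmitLists` are illustrative names, and the handle and stage values are arbitrary numbers rather than real Vulkan constants.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for vsg::Semaphore: a handle plus the pipeline
// stage mask at which the wait takes place.
struct MockSemaphore
{
    std::uint64_t handle;
    std::uint32_t stageFlags;
};

// The three arrays that would be handed to VkSubmitInfo.
struct SubmitLists
{
    std::vector<std::uint64_t> waitHandles;
    std::vector<std::uint32_t> waitStages;
    std::vector<std::uint64_t> signalHandles;
};

// Collects the wait semaphores in the same order as finish(): early
// transfer, late transfer, per-window image-available, application waits.
SubmitLists buildSubmitLists(const std::optional<MockSemaphore>& earlyTransfer,
                             const std::optional<MockSemaphore>& lateTransfer,
                             const std::vector<MockSemaphore>& imageAvailable,
                             const std::vector<MockSemaphore>& applicationWaits,
                             const std::vector<MockSemaphore>& applicationSignals)
{
    SubmitLists lists;

    // Handle and stage mask are always pushed as a pair.
    auto addWait = [&lists](const MockSemaphore& semaphore) {
        lists.waitHandles.push_back(semaphore.handle);
        lists.waitStages.push_back(semaphore.stageFlags);
    };

    if (earlyTransfer) addWait(*earlyTransfer);
    if (lateTransfer) addWait(*lateTransfer);
    for (const auto& semaphore : imageAvailable) addWait(semaphore);
    for (const auto& semaphore : applicationWaits) addWait(semaphore);

    for (const auto& semaphore : applicationSignals) lists.signalHandles.push_back(semaphore.handle);

    return lists;
}
```

Optional semaphores that are absent simply contribute nothing, which matches the `if (earlyDataTransferredSemaphore)` checks in the original code.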
VkResult Queue::submit(const VkSubmitInfo& submitInfo, Fence* fence)
{
    std::scoped_lock<std::mutex> guard(_mutex);
    return vkQueueSubmit(_vkQueue, 1, &submitInfo, fence ? fence->vk() : VK_NULL_HANDLE);
}
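The code above shows where vsg::Fence is used. The VkFence parameter of vkQueueSubmit is a GPU-to-CPU synchronization mechanism: it tells the CPU whether the GPU has finished executing a queue submission (containing one or more command buffers). In the fence synchronization step of the start stage (RecordAndSubmitTask::start()), the task waits for the fence to be signaled. This orders the queue submission against the start stage: the next round's start stage can only proceed once the queue submission that used the same fence has completed.
Conclusion: this chapter analyzed the record and submit process of the frame loop. Recording and submission are carried out at the granularity of a record-and-submit task (vsg::RecordAndSubmitTask) and consist of four stages: start, data transfer before the record traversal, record, and finish. The finish stage in turn has two sub-stages: data transfer after the record traversal, and submission. In the submission sub-stage, vsg::Semaphore and vsg::Fence implement a series of fairly involved synchronization steps that ensure the different stages execute in the correct order. The next chapter will analyze the present stage of the frame loop.
Items still to be analyzed: the role of vsg::DatabasePager while the scene graph is being updated, and record and submit under multithreading.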
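The ordering guarantee can be illustrated with a single-threaded simulation. It is a sketch under strong assumptions: `SimFence`, `SimQueue` and `runTwoFrames` are hypothetical names, and the "GPU" is just a deque of deferred callbacks that is driven forward when the CPU waits. A real VkFence wait blocks while the GPU runs concurrently; the sketch only shows the resulting order of events.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical fence: set by the simulated queue once the work has run.
struct SimFence
{
    bool signaled = false;
};

// Hypothetical queue that defers work, standing in for the GPU.
struct SimQueue
{
    struct Submission
    {
        std::function<void()> work;
        SimFence* fence;
    };
    std::deque<Submission> pending;

    // Like a queue submission: returns immediately, the work runs later.
    void submit(std::function<void()> work, SimFence* fence)
    {
        pending.push_back({std::move(work), fence});
    }

    // The simulated GPU executes the oldest submission and signals its fence.
    void executeNext()
    {
        if (pending.empty()) return;
        Submission submission = std::move(pending.front());
        pending.pop_front();
        submission.work();
        if (submission.fence) submission.fence->signaled = true;
    }

    // Like waiting on a fence: returns only after the fence is signaled.
    // In this single-threaded sketch, waiting drives the simulated GPU.
    void waitFor(SimFence& fence)
    {
        while (!fence.signaled && !pending.empty()) executeNext();
    }
};

// Submits frame 0, then performs the wait that the next round's start
// stage would perform, and returns the observed order of events.
std::vector<std::string> runTwoFrames()
{
    std::vector<std::string> log;
    SimQueue queue;
    SimFence fence;

    queue.submit([&log] { log.push_back("gpu: frame 0 executed"); }, &fence);
    log.push_back("cpu: frame 0 submitted");

    queue.waitFor(fence);   // start stage of the next round waits here
    fence.signaled = false; // reset the fence before it is reused
    log.push_back("cpu: frame 1 may start");

    return log;
}
```

The log shows that the CPU returns from the submission before the work has run, and that the next round begins only after the fence reports completion.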