Fix D3D12 NanoVDB volume load stalls
287
docs/plan/Shader预编译缓存与D3D12字节码正式化计划_2026-04-10.md
Normal file
@@ -0,0 +1,287 @@
# Shader Precompiled Cache and D3D12 Bytecode Formalization Plan

Document date: 2026-04-10

Scope: the current `Resources / Asset / Rendering / RHI / Editor` mainline of `XCEngine`. The target problem: although the editor and runtime already have a shader artifact and variant system, the D3D12 path still generally calls `D3DCompile(...)` on the spot at runtime and lacks a truly reusable cache of compiled bytecode.

Related archive:
[NanoVDB体积云加载阻塞与Runtime上传修复计划_完成归档_2026-04-10.md](/D:/Xuanchi/Main/XCEngine/docs/used/NanoVDB体积云加载阻塞与Runtime上传修复计划_完成归档_2026-04-10.md)

---

## 1. Current Conclusions

This round of `NanoVDB` stall investigation has firmly separated "resource read/upload" from "shader compilation":

1. Loading the `main.xcvol` artifact now takes under one second.
2. CPU read and GPU upload of the payload for `cloud.nvdb` are no longer the main bottleneck.
3. The largest remaining cost is runtime shader compilation during D3D12 graphics pipeline creation.

Latest editor measurements:

- `VolumeFieldLoader Artifact total_ms = 674`
- `AsyncVolumeLoadCompleted elapsed_ms = 909`
- `UploadVolumeField total_ms = 736`
- `D3D12 shader compile MainPS total_ms = 6205`
- `SceneReady elapsed_ms = 8500`

This shows that the mainline problem has shifted from "the volume resource path is wrong" to "D3D12 runtime shader compilation is not covered by any cache mechanism".

---

## 2. Root Problem

The project has shader artifacts and a reserved `compiledBinary` field, but the chain as a whole is not actually closed.

### 2.1 Present but not closed

1. `ShaderStageVariant` has a `compiledBinary` field.
2. The shader artifact file format serializes and deserializes `compiledBinary`.
3. The shader runtime build also accounts for the memory used by `compiledBinary`.

### 2.2 Actually missing

1. No formal step writes D3D12 compile results into `variant.compiledBinary`.
2. No D3D12 runtime path consumes `compiledBinary` first.
3. `D3D12Device::CreatePipelineState(...)` still runs `D3DCompile(...)` directly on the `ShaderCompileDesc`.
4. What the shader artifact caches today is the expanded source and the variant description, not DXBC/DXIL that D3D12 can use directly.

So the real state is not "the cache occasionally misses" but:

`The D3D12 shader precompiled cache mechanism has not actually landed at all.`

`NanoVDB` merely exposed this systemic gap first, because its `MainPS` is heavy enough.

---

## 3. Goals

## 3.1 Primary goal

Stop the D3D12 path from repeatedly compiling the same shader variant at runtime.

Requirements:

1. After the first compile, the reusable D3D12 bytecode can be persisted to disk.
2. Reopening the editor / scene / project hits the cache first.
3. On a cache hit, `CreatePipelineState(...)` no longer enters `D3DCompile(...)`.

## 3.2 Secondary goal

Make this a general capability rather than a patch for `volumetric.shader` alone.

Requirements:

1. Covers all D3D12 shader variants.
2. Compatible with the existing shader artifact/variant system.
3. Cache invalidation rules are explicit, and the cache invalidates correctly when source, profile, macro, backend, or entry point changes.

## 3.3 Tertiary goal

Leave room for a unified design across the later Vulkan / OpenGL / DXC paths.

This round does not need to finish multiple backends in one go, but the data model and interface naming must not block later extension.

---

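## 4. Non-goals

This round does not do the following:

1. Rewrite the whole shader authoring system.
2. Switch directly to DXC as a full replacement for FXC.
3. Build a unified cross-backend offline compilation toolchain first.
4. Build a complete PSO cache blob system first.
5. Solve all first-compile shader cost first; the target is only the systemic gap of "repeated on-the-spot compilation".

---
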
## 5. Formal Implementation Directions

## 5.1 Direction 1: define the data model of the D3D12 precompiled product

First settle what "cache hit" means.

The recommendation is to fix the D3D12 precompiled cache key as a stable combination of:

1. shader resource path
2. pass name
3. stage
4. backend
5. source language
6. entry point
7. profile
8. macro set
9. variant keyword set
10. source content hash

Products:

1. The compiled D3D12 bytecode for the corresponding stage
2. Optional version identifier for compile log/debug info
3. Optional version number for data derived from shader reflection

The purpose of this step is not an immediate speedup, but to avoid shipping a half-finished cache that "looks present but in practice either misses or hits wrongly".
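
The key composition above can be sketched in C++ as follows. This is a minimal sketch under assumptions, not XCEngine code: `ShaderCacheKey`, `BuildKeyString`, and `HashShaderCacheKey` are hypothetical names, and FNV-1a stands in for whatever stable hash the engine eventually picks. Macros and keywords are sorted before serialization so that the key does not depend on declaration order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical key type; the fields mirror the ten items listed above.
struct ShaderCacheKey {
    std::string shaderPath;
    std::string passName;
    std::string stage;
    std::string backend;
    std::string sourceLanguage;
    std::string entryPoint;
    std::string profile;
    std::vector<std::pair<std::string, std::string>> macros;  // name, value
    std::vector<std::string> keywords;
    std::uint64_t sourceHash = 0;  // hash of the expanded source text
};

// 64-bit FNV-1a, used here only as a simple stable hash for illustration.
inline std::uint64_t Fnv1a64(const std::string& text) {
    std::uint64_t hash = 0xcbf29ce484222325ull;
    for (unsigned char c : text) {
        hash ^= c;
        hash *= 0x100000001b3ull;
    }
    return hash;
}

// Length-prefix every field so that ("ab", "c") and ("a", "bc") serialize differently.
inline void AppendField(std::string& out, const std::string& value) {
    out += std::to_string(value.size());
    out += ':';
    out += value;
}

// Takes the key by value so the sets can be sorted without touching the caller's copy.
inline std::string BuildKeyString(ShaderCacheKey key) {
    assert(!key.shaderPath.empty() && "a cache key needs a shader path");
    std::sort(key.macros.begin(), key.macros.end());
    std::sort(key.keywords.begin(), key.keywords.end());

    std::string out;
    AppendField(out, key.shaderPath);
    AppendField(out, key.passName);
    AppendField(out, key.stage);
    AppendField(out, key.backend);
    AppendField(out, key.sourceLanguage);
    AppendField(out, key.entryPoint);
    AppendField(out, key.profile);
    AppendField(out, std::to_string(key.macros.size()));
    for (const auto& macro : key.macros) {
        AppendField(out, macro.first);
        AppendField(out, macro.second);
    }
    AppendField(out, std::to_string(key.keywords.size()));
    for (const auto& keyword : key.keywords) {
        AppendField(out, keyword);
    }
    AppendField(out, std::to_string(key.sourceHash));
    return out;
}

inline std::uint64_t HashShaderCacheKey(const ShaderCacheKey& key) {
    return Fnv1a64(BuildKeyString(key));
}
```

Two keys that differ only in macro order produce the same string and therefore the same hash, while any change to a field such as the profile produces a different string. Comparing the full key string on a hash match would guard against hash collisions.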

## 5.2 Direction 2: turn `compiledBinary` from a placeholder field into a real product

Today `compiledBinary` is only a format field, not a real capability.

This phase must:

1. Write the bytecode back to `ShaderStageVariant::compiledBinary` after a successful D3D12 compile
2. Carry that bytecode when the shader artifact is written out
3. Restore that bytecode when the shader artifact is reloaded

Requirements:

1. The artifact for a given variant can carry the D3D12 bytecode directly
2. Non-D3D12 backends are not polluted by this data
3. When no cache exists, fallback to on-the-spot compilation is still allowed

## 5.3 Direction 3: the D3D12 runtime consumes compiled bytecode first

This is the key to closing the loop at runtime.

What needs to change is not the asset layer but the entry points where the D3D12 runtime creates shaders / pipelines:

1. `CreateShader(...)`
2. `CreatePipelineState(...)`
3. Any call chain into the internal `CompileD3D12Shader(...)`

The priority order should be:

1. On a `compiledBinary` hit, create the shader bytecode directly
2. Only on a miss, go through `D3DCompile(...)`
3. After the first successful on-the-spot compile, the result may be written back to the cache

Acceptance criteria:

1. On a second launch of the same project, `MainPS` is not recompiled
2. The log clearly distinguishes `cache_hit` / `cache_miss` / `runtime_compile`
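
The priority order above can be sketched without any D3D12 dependency. In this hedged illustration, `StageVariantSketch` is a hypothetical stand-in for `ShaderStageVariant` (only the `compiledBinary` name comes from the plan), and `CompileFn` is an assumed callback that wraps the existing `D3DCompile(...)`-based path; none of these names are existing engine API.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

using Bytecode = std::vector<std::uint8_t>;

enum class ShaderLoadPath { CacheHit, RuntimeCompile, Failed };

// Hypothetical stand-in for ShaderStageVariant.
struct StageVariantSketch {
    Bytecode compiledBinary;  // empty means "no cached bytecode"
};

// Assumed wrapper around the runtime compile path; returns false on failure.
using CompileFn = std::function<bool(Bytecode& outBytecode)>;

inline ShaderLoadPath ResolveBytecode(StageVariantSketch& variant,
                                      const CompileFn& compile,
                                      Bytecode& outBytecode,
                                      std::string& outLogTag) {
    assert(compile && "a fallback compiler must be supplied");

    // 1. Cache hit: use the stored bytecode and never touch the compiler.
    if (!variant.compiledBinary.empty()) {
        outBytecode = variant.compiledBinary;
        outLogTag = "cache_hit";
        return ShaderLoadPath::CacheHit;
    }

    // 2. Cache miss: fall back to the runtime compile.
    Bytecode compiled;
    if (!compile(compiled) || compiled.empty()) {
        outLogTag = "cache_miss runtime_compile failed";
        return ShaderLoadPath::Failed;
    }

    // 3. Write the result back so the next lookup is a hit.
    variant.compiledBinary = compiled;
    outBytecode = std::move(compiled);
    outLogTag = "cache_miss runtime_compile";
    return ShaderLoadPath::RuntimeCompile;
}
```

Calling `ResolveBytecode` twice on the same variant invokes the compile callback only once; the second call returns `ShaderLoadPath::CacheHit`. Persisting `compiledBinary` to the artifact on disk is a separate step that this sketch leaves out.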
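
## 5.4 Direction 4: formalize the cache invalidation rules

Without invalidation rules, the cache becomes a hazard.

Invalidation must consider at least:

1. Changes to shader source file content
2. Changes to the expanded `#include` result
3. Changes to the entry point
4. Changes to the profile
5. Changes to macros
6. Changes to the backend
7. Changes to variant keywords
8. Changes to the shader artifact schema version

This phase must produce an explicit rule set, so that the error of "the shader was changed but the old bytecode is still consumed" cannot appear later.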
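
One way to make these rules checkable is to store a small header next to each cached bytecode blob and reject the blob whenever the header disagrees with what the runtime expects. The sketch below is an illustration under assumptions: `CachedBytecodeHeader`, `kArtifactSchemaVersion`, and `CheckCacheEntry` are hypothetical names, and `keyHash` is assumed to already fold in source content, expanded `#include` text, entry point, profile, macros, backend, and variant keywords.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical schema version; bump it whenever the artifact layout changes.
constexpr std::uint32_t kArtifactSchemaVersion = 1;

// Hypothetical header stored next to each cached bytecode blob.
struct CachedBytecodeHeader {
    std::uint32_t schemaVersion = 0;
    std::uint64_t keyHash = 0;       // hash of the full cache key at write time
    std::uint64_t bytecodeSize = 0;  // size in bytes of the blob that follows
};

enum class CacheRejectReason { None, SchemaMismatch, KeyMismatch, EmptyPayload };

// Returns None when the entry may be used, otherwise the reason to fall back.
inline CacheRejectReason CheckCacheEntry(const CachedBytecodeHeader& header,
                                         std::uint64_t expectedKeyHash) {
    if (header.schemaVersion != kArtifactSchemaVersion) {
        return CacheRejectReason::SchemaMismatch;
    }
    if (header.keyHash != expectedKeyHash) {
        return CacheRejectReason::KeyMismatch;
    }
    if (header.bytecodeSize == 0) {
        return CacheRejectReason::EmptyPayload;
    }
    return CacheRejectReason::None;
}
```

The returned reason can double as the "fallback reason" field in the log, which keeps a stale-cache rejection distinguishable from a plain first-run miss.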

## 5.5 Direction 5: introduce verifiable logging and tests

This time the `NanoVDB` problem was pinned down with timed logs, not guesswork.

After the shader precompiled cache is formalized, the minimum necessary logging must be kept:

1. cache key
2. cache hit/miss
3. runtime compile elapsed
4. binary load elapsed
5. fallback reason

Tests must cover at least:

1. First cold start: miss + compile + write
2. Second warm start: hit + no compile
3. After changing shader source: cache invalidated + recompile
4. After switching profile / macro / keyword: cache invalidated

---

## 6. Phased Execution Plan

## Phase 0: baseline and interface inventory

Tasks:

1. Map out how `ShaderStageVariant::compiledBinary` is currently used
2. Inventory every compile entry point in D3D12 shader / pipeline creation
3. Define a unified cache key and versioning strategy
4. Confirm the responsibility boundary between artifact and runtime

Deliverables:

1. A fixed set of cache key rules
2. A list of D3D12 compile call chains

## Phase 1: make the artifact actually carry the D3D12 compile product

Tasks:

1. Generate and save compiled bytecode for D3D12 variants
2. Artifact write and read fully cover `compiledBinary`
3. Attach the necessary version information to the compile product

Deliverables:

1. A real binary payload exists in the D3D12 shader artifact
2. The bytecode can be verified as identical before and after the artifact round trip

## Phase 2: make the D3D12 runtime consume the cache first

Tasks:

1. Adjust the `CompileD3D12Shader(...)` call chain
2. Construct shader bytecode from `compiledBinary` first
3. Fall back to `D3DCompile(...)` on a miss
4. Wire up hit / invalidation / write-back logging

Deliverables:

1. Reopening the editor no longer recompiles the volumetric cloud `MainPS`
2. Other D3D12 shaders gain the same cache capability

## Phase 3: introduce automated verification and cold-start comparison

Tasks:

1. Add shader cache hit tests
2. Add variant invalidation tests
3. Compare two consecutive editor cold starts

Target numbers:

1. On the second launch, `MainPS compile total_ms` should be close to `0`
2. `SceneReady` should keep dropping from the current `~8.5s`

---

## 7. Recommended Commit Boundaries

Before the new plan starts, the recommendation is to split the completed fixes and the later shader cache work into two commits:

### Commit 1: NanoVDB path fix and compile hotspot relief

Should contain:

1. volume artifact / payload path fix
2. `ResizeUninitialized` and the fix for default-construction overhead on large payloads
3. Removal of the high-risk `[unroll]` from `volumetric.shader`
4. HLSL profile aligned to `5_1`
5. The timing logs currently retained

### Commit 2: shader precompiled cache formalization

Not mixed into this commit for now, to avoid stirring "verified fixes" together with "next-phase system rework".

---

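## 8. Acceptance Criteria

When this plan is complete, at least the following hold:

1. D3D12 no longer compiles the same shader variant on the spot at a second launch.
2. The log clearly proves cache hits.
3. After a shader is modified, the cache invalidates correctly.
4. The second-launch `SceneReady` of the `NanoVDB` volumetric cloud scene is no longer slowed by `MainPS` compilation.
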
485
docs/used/NanoVDB体积云加载阻塞与Runtime上传修复计划_完成归档_2026-04-10.md
Normal file
@@ -0,0 +1,485 @@
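# NanoVDB Volumetric Cloud Load Stall and Runtime Upload Fix Plan

Document date: 2026-04-10

Scope: the current `Resources / Asset / Rendering / RHI / Editor` mainline of `XCEngine`. The target problem: after the editor opens the main scene, the `cloud.nvdb` volumetric cloud takes a long time to load and the editor freezes for a long time again once the first frame is unlocked, whereas `mvs/VolumeRenderer` loads and displays the same `cloud.nvdb` in only about 3 seconds.

Goal of this document: rebuild the editor's `NanoVDB` volumetric cloud load chain. The current, incorrect mode is "after the CPU finishes the asynchronous read, the first visible render frame synchronously creates and writes a large GPU resource, blocking the main thread for a long time". The target is a correct mode close to `mvs/VolumeRenderer`: "asynchronous CPU read + GPU-local buffer upload + no blocking of editor interaction before the upload completes + the runtime no longer handles the large payload on the draw path".

---
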
## 1. Problem Conclusions

The current problem is not a single bug but three costs stacked together:

1. The `main.xcvol` artifact that `AssetDatabase` produces for `.nvdb` is only a "metadata + raw NanoVDB payload" wrapper and barely reduces the runtime payload cost.
2. The deferred scene load of `VolumeRendererComponent` only reads the CPU resource asynchronously into `VolumeField`; it does not complete the GPU upload ahead of time.
3. When `BuiltinVolumetricPass` first actually consumes the `VolumeField`, `RenderResourceCache::UploadVolumeField()` synchronously creates a `StorageBuffer` on the render thread and calls `SetData()` to write the whole payload, turning the first visible frame into one large synchronous upload.

Item 3 is the most critical.

In the current D3D12 backend:

- `BufferType::Storage` defaults to the `UPLOAD heap`
- `SetData()` goes through `Map + memcpy`
- So what the volumetric shader ultimately reads is not a GPU-local `DEFAULT heap` buffer but a CPU-writable upload buffer

This differs fundamentally from the `mvs/VolumeRenderer` chain. `mvs` does the following:

1. Create the final `DEFAULT heap` buffer
2. Create a temporary `UPLOAD heap` staging buffer
3. The CPU writes only to staging
4. Copy into the default heap via `CopyBufferRegion`
5. The shader then reads from the GPU-local buffer

Therefore the fix with the largest first-stage payoff is not to keep optimizing the artifact file, but to make the editor/runtime volume buffer upload path isomorphic to `mvs`.

---

## 2. Current State and Root Cause Breakdown

## 2.1 Current editor chain

### Scene open stage

1. `SceneManager::LoadScene()` restores the scene structure inside the deferred scene load scope.
2. `VolumeRendererComponent` only restores `volumeRef -> path` and does not load synchronously right away.
3. `LoadAsync()` is triggered only when `GetVolumeField()` is called during render extraction.
4. `Runtime streaming scene assets...` in the editor status bar reflects only the CPU-side async resource request count.

### CPU resource completion stage

1. After `LoadAsync()` completes, the `VolumeField` is available in CPU memory.
2. At this point the editor can leave the "streaming" state, but the volume payload is not yet resident on the GPU.
3. The first pass that actually draws the volumetric cloud enters `BuiltinVolumetricPass::DrawVisibleVolume()`.
4. `RenderResourceCache::GetOrCreateVolumeField()` finds no cache entry and triggers `UploadVolumeField()`.
5. `UploadVolumeField()` synchronously allocates the buffer / SRV on the render thread and writes the whole payload.

### Result

1. The user can interact for the first few seconds, because the async CPU read does not block the main thread.
2. Once the CPU resource is available and the first draw happens, the main thread suddenly bears the full GPU upload.
3. Because the payload is about `590 MB`, the first visible render frame freezes for a long time.

## 2.2 Problem boundary of the current artifact chain

`cloud.nvdb` already hits `Library/Artifacts/.../main.xcvol`, so the problem is not "reimport every time".

But the current `xcvol` does not really eliminate runtime cost either:

1. Writing the artifact writes out the `VolumeField` payload directly.
2. Reading the artifact reads the whole payload back into a CPU array.
3. The runtime still has to upload the whole payload to the GPU.

In other words, the value of the current artifact is mainly:

- Stable import results
- Structured metadata
- Letting project assets go through the unified `AssetRef` / `Library` flow

It does not yet provide:

- Zero-copy or near-zero-copy loading at runtime
- Pre-baked GPU-resident state
- Runtime acceleration aimed directly at the volume draw path

## 2.3 Essential difference from `mvs/VolumeRenderer`

It is not that "both upload the same buffer, so in theory they should be equally fast"; the two currently have different final resource models:

### `mvs/VolumeRenderer`

- Final resource: GPU-local buffer on the `DEFAULT heap`
- Intermediate resource: temporary `UPLOAD heap`
- Upload mode: submit the copy on the copy queue / direct queue, then wait for completion
- Draw path: consumes only GPU buffers whose upload has completed

### Current editor

- Final resource: `UPLOAD heap` buffer
- Intermediate resource: no dedicated staging
- Upload mode: synchronous `Map + memcpy` inside the draw path
- Draw path: also carries the resource upload duty on first consumption

This means:

1. The editor's first-frame draw path carries too much responsibility
2. The volume payload ends up in the wrong place
3. Even if CPU read time is similar, GPU upload and subsequent shader read performance will still lag noticeably behind `mvs`

---

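## 3. Fix Goals

This fix has three tiers of goals.

## 3.1 Primary goal: first eliminate the "sudden freeze for ten-plus seconds after 8 seconds"

Requirements:

1. After the main scene opens, the editor shows no long main-thread freeze either before or after the volume payload first becomes visible.
2. `BuiltinVolumetricPass` no longer carries large synchronous uploads.
3. `StorageBuffer` no longer leaves the volume payload on the `UPLOAD heap` by default as the final runtime resource.

## 3.2 Secondary goal: make the editor's volume GPU upload path isomorphic to `mvs`

Requirements:

1. Under D3D12 the volume payload ultimately resides on the `DEFAULT heap`.
2. The CPU writes only to staging / upload resources.
3. The GPU performs the actual copy through a submitted copy.
4. Shaders subsequently read only GPU-local resources.

## 3.3 Tertiary goal: keep moving total load time toward `mvs`

Requirements:

1. Gradually reduce the runtime load cost of `xcvol -> CPU payload`.
2. In the future, let the volume artifact flow directly into the GPU upload path instead of "fully resident on the CPU first, then fully copied to the GPU".
3. While keeping the engine's formal resource system consistent, push total time as close as possible to the `mvs` baseline of about 3 seconds.

---
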
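## 4. Non-goals

To keep the fix focused, this round does not do the following:

1. Rewrite the NanoVDB ray marching algorithm itself.
2. Degrade the formal mainline path back to the isolated sample structure of `mvs/VolumeRenderer`.
3. Start with volume compression formats, voxel-cropping re-encoding, or chunked sparse streaming.
4. Start by refactoring the whole SRP / render graph.
5. Treat "delete Library and rebuild" as a fix.
6. Invent an editor-private side-path renderer just for volumetric clouds.

---
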
## 5. Formal Fix Directions

## 5.1 Direction 1: add the "immutable GPU-local buffer + initial data upload" capability

This is the highest priority of this round.

### Current gap

Today `RHIDevice::CreateBuffer(const BufferDesc&)` only describes "create a buffer" and cannot express:

- The final resource should land in GPU-local memory
- Initial data enters through a staging copy
- The resource enters a given final state right after creation

So `RenderResourceCache::UploadVolumeField()` can only:

1. `CreateBuffer()`
2. `CreateShaderResourceView()`
3. `SetData()`

For D3D12 volumes, this path is wrong.

### Target capability

Add a formal buffer creation capability with semantics like:

- `CreateBuffer(const BufferDesc& desc, const void* initialData, size_t initialDataSize, ResourceStates finalState)`

or

- `CreateInitializedBuffer(...)`

Requirements:

1. D3D12 volume storage buffers go to the `DEFAULT heap`
2. The copy is done through upload staging
3. After the copy, transition to `GenericRead` or the corresponding shader-readable state
4. The returned object is still the unified `RHIBuffer`

### Design principles

1. Do not globally change the default semantics of the existing `CreateBuffer(BufferType::Storage)`
2. Add a dedicated path only for resources that need "device-local + initial data upload"
3. Keep scenarios where old code relies on `SetData()` working

This step has the largest payoff because it solves both:

1. The large synchronous memcpy on the main thread in the first frame
2. The incorrect model in which the final volume resource lands on the upload heap

## 5.2 Direction 2: move the volume GPU upload ahead of the draw path

Fixing where the D3D12 buffer lands is not enough.

If the volume upload is still triggered on the first execution of `BuiltinVolumetricPass::DrawVisibleVolume()`, then even with a more correct upload path, the first visible frame still has to wait for the GPU upload to finish, and the editor will still hitch noticeably.

So responsibilities also need to change as follows:

### Correct responsibility layering

#### CPU async resource layer

- `VolumeRendererComponent` obtains the `VolumeField` asynchronously

#### GPU upload scheduling layer

- Once the `VolumeField` CPU resource is detected as complete, submit a GPU upload request
- The volume cache enters the `Uploading` state

#### Render consumption layer

- `BuiltinVolumetricPass` consumes only `GpuReady` volumes
- Volumes that are not ready are not drawn, and no ad hoc upload happens

### Target state machine

The recommendation is to explicitly introduce the following states for the volume runtime cache:

1. `Uninitialized`
2. `CpuReady`
3. `Uploading`
4. `GpuReady`
5. `Failed`

The first-frame draw path is no longer responsible for advancing directly from `Uninitialized/CpuReady` to `GpuReady`.
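
The state machine above can be sketched as a small transition function. The five state names come from the list above; the `VolumeEvent` enum, `Advance`, and `CanDraw` are hypothetical names introduced for this illustration, and the exact set of events is an assumption.

```cpp
#include <cassert>

enum class VolumeGpuState { Uninitialized, CpuReady, Uploading, GpuReady, Failed };

// Hypothetical events; each one is raised outside the draw path.
enum class VolumeEvent { CpuLoadCompleted, UploadSubmitted, UploadCompleted, Error };

// Returns the next state. Events that are not valid for the current state
// leave it unchanged, so a stray event cannot skip a stage.
inline VolumeGpuState Advance(VolumeGpuState state, VolumeEvent event) {
    switch (state) {
    case VolumeGpuState::Uninitialized:
        if (event == VolumeEvent::CpuLoadCompleted) return VolumeGpuState::CpuReady;
        if (event == VolumeEvent::Error) return VolumeGpuState::Failed;
        return state;
    case VolumeGpuState::CpuReady:
        if (event == VolumeEvent::UploadSubmitted) return VolumeGpuState::Uploading;
        if (event == VolumeEvent::Error) return VolumeGpuState::Failed;
        return state;
    case VolumeGpuState::Uploading:
        if (event == VolumeEvent::UploadCompleted) return VolumeGpuState::GpuReady;
        if (event == VolumeEvent::Error) return VolumeGpuState::Failed;
        return state;
    case VolumeGpuState::GpuReady:
    case VolumeGpuState::Failed:
        return state;  // terminal in this sketch
    }
    return state;
}

// The draw path only queries the state; it never advances it.
inline bool CanDraw(VolumeGpuState state) {
    return state == VolumeGpuState::GpuReady;
}
```

Because `CanDraw` is the only function the render pass would call, a volume in `CpuReady` or `Uploading` is simply skipped for that frame instead of triggering an upload.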
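
## 5.3 Direction 3: give volumes a formal upload queue or an out-of-frame warm-up entry point

This round needs at least one minimal formal mechanism to take over GPU upload work.

### Minimal viable form

1. Add a volume upload service in the render system or around `RenderResourceCache`
2. Poll completed CPU async loads in the main loop
3. Submit the volume GPU upload to the render device layer
4. Switch to `GpuReady` once the upload completes

### Recommended formal direction

Unify this as a "render resource upload service", so that mesh / large texture uploads can gradually converge here later.

This volume fix may start with a volume-only version, but interface naming must not block future extension.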
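
A minimal polling service along these lines might look like the sketch below. It is an illustration under assumptions: `VolumeUploadServiceSketch`, `Submit`, and `Tick` are hypothetical names, and the `isGpuCopyDone` callback stands in for whatever fence query the render device layer exposes.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical upload service; it tracks submitted uploads and polls them
// once per main-loop iteration without ever blocking.
class VolumeUploadServiceSketch {
public:
    using Poll = std::function<bool()>;     // true once the GPU copy has finished
    using OnReady = std::function<void()>;  // e.g. switch the cache entry to GpuReady

    void Submit(Poll isGpuCopyDone, OnReady onReady) {
        assert(isGpuCopyDone && "a completion query is required");
        m_pending.push_back({std::move(isGpuCopyDone), std::move(onReady)});
    }

    // Returns how many uploads completed during this tick.
    std::size_t Tick() {
        std::size_t completed = 0;
        for (std::size_t i = 0; i < m_pending.size();) {
            if (m_pending[i].isGpuCopyDone()) {
                OnReady onReady = std::move(m_pending[i].onReady);
                m_pending.erase(m_pending.begin() + static_cast<std::ptrdiff_t>(i));
                if (onReady) {
                    onReady();
                }
                ++completed;
            } else {
                ++i;
            }
        }
        return completed;
    }

    std::size_t PendingCount() const { return m_pending.size(); }

private:
    struct Entry {
        Poll isGpuCopyDone;
        OnReady onReady;
    };
    std::vector<Entry> m_pending;
};
```

`PendingCount()` is the kind of value an editor status bar could show as a GPU upload pending count. Nothing in the class is volume-specific apart from its name, which fits the plan's note about later generalizing to other resource types.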

## 5.4 Direction 4: reduce the runtime CPU load cost of `xcvol`

This is the second-priority major item.

The current artifact still requires:

1. Open the file
2. Read the header
3. Allocate the payload array
4. Read the whole payload into the CPU
5. Upload the whole payload to the GPU

To approach the total time of `mvs`, later work must continue with:

### Formal direction

1. Separate the volume artifact header and payload more explicitly
2. Support memory-mapping the volume payload or streaming it into the upload buffer
3. Do not keep the `590 MB` CPU payload resident long term unless necessary

### Boundary for this round

This round does not require memory-mapped zero-copy in one step, but the document and interface design must leave room for it.

---

## 6. Phased Implementation Plan

## Phase 0: lock down metrics and sample the baseline

### Goal

Quantify the baseline before changing code, so later judgments do not rest on feel alone.

### Tasks

1. Record three time points for the editor opening `Main.xc`:
   - Scene structure restore completed
   - `Runtime streaming scene assets...` ends
   - Volume actually visible
2. Record the time from launch to volume visible for the current `mvs/VolumeRenderer`.
3. Add a focused trace for volume load / upload that distinguishes:
   - CPU artifact load
   - GPU upload begin
   - GPU upload complete
4. Record the creation type and heap type of the D3D12 volume buffer.

### Deliverables

1. A set of reproducible measurements
2. A set of reproducible log samples
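
A focused trace of this kind can be captured with a small RAII timer built on `std::chrono::steady_clock`. The sketch below is an assumption-based illustration: `ScopedTraceTimer` and its sink callback are hypothetical names, and the `total_ms=` suffix simply mimics the log lines quoted in this commit.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>

// Hypothetical RAII helper: on destruction it reports "<label> total_ms=<n>"
// to the supplied sink (for example, a logger).
class ScopedTraceTimer {
public:
    using Clock = std::chrono::steady_clock;
    using Sink = std::function<void(const std::string&)>;

    ScopedTraceTimer(std::string label, Sink sink)
        : m_label(std::move(label)), m_sink(std::move(sink)), m_start(Clock::now()) {}

    ScopedTraceTimer(const ScopedTraceTimer&) = delete;
    ScopedTraceTimer& operator=(const ScopedTraceTimer&) = delete;

    ~ScopedTraceTimer() {
        if (m_sink) {
            m_sink(m_label + " total_ms=" + std::to_string(ElapsedMs()));
        }
    }

    // Milliseconds since construction, measured on a monotonic clock.
    std::uint64_t ElapsedMs() const {
        return static_cast<std::uint64_t>(
            std::chrono::duration_cast<std::chrono::milliseconds>(Clock::now() - m_start).count());
    }

private:
    std::string m_label;
    Sink m_sink;
    Clock::time_point m_start;
};
```

Wrapping each stage (CPU artifact load, GPU upload) in its own scope yields one line per stage, so the stages can be compared directly across runs.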

## Phase 1: add initialized GPU-local buffer capability to the RHI

### Goal

Make D3D12 volume storage buffers use `DEFAULT heap + staging copy`.

### Tasks

1. Add an initialized buffer creation interface at the `RHIDevice` layer.
2. The D3D12 implementation reuses the existing texture upload approach and builds:
   - upload queue
   - upload allocator
   - upload command list
   - fence / idle wait
3. Volumes use the new interface instead of `CreateBuffer + SetData()`.
4. Keep the SRV creation logic and the shader read interface unchanged.

### Acceptance criteria

1. Under D3D12 the final buffer for the volume payload is no longer on the upload heap.
2. `RenderResourceCache::UploadVolumeField()` no longer writes the large payload through `SetData()`.
3. The stall when the scene first becomes visible drops noticeably.

## Phase 2: remove the GPU upload from the draw path

### Goal

The first visible frame only consumes ready resources and performs no large resource upload.

### Tasks

1. Introduce upload state in the volume runtime cache layer.
2. Submit the GPU upload request after the CPU async load completes.
3. `BuiltinVolumetricPass` skips a volume when it is in the `Uploading` state.
4. The editor status bar may optionally be extended to show both:
   - CPU streaming count
   - GPU upload pending count

### Acceptance criteria

1. Opening a scene no longer shows "a sudden long freeze after streaming ends".
2. The editor stays interactive during the volume GPU upload.
3. The volume appears naturally once the upload completes.

## Phase 3: reduce the runtime CPU cost of `xcvol`

### Goal

Keep closing in on the total time of `mvs`, rather than only fixing the stall.

### Tasks

1. Evaluate whether `VolumeFieldLoader` can allow deferred payload ownership or mapped reads.
2. Separate the metadata and payload read responsibilities of `xcvol`.
3. Evaluate a "read directly into upload staging" path to reduce intermediate copies.

### Acceptance criteria

1. Total time to volume visible after opening the main scene keeps dropping.
2. Peak CPU memory and the number of intermediate copies drop.

## Phase 4: formalize verification and regression protection

### Goal

Make sure the fix does not regress on later volume / mesh / texture paths.

### Tasks

1. Add volume upload unit tests or minimal integration verification.
2. Add log assertions or profiling hooks on the D3D12 path.
3. Verify that the Scene View / Game View / runtime volume rendering paths are consistent.

### Acceptance criteria

1. The main scene volume opens reliably
2. The behavior gap between `mvs` and the editor has a clear quantified explanation
3. Later iterations can continue toward a stronger runtime streaming architecture

---

## 7. Modules Involved

This round's implementation is expected to touch the following modules:

### RHI

- `engine/include/XCEngine/RHI/RHIDevice.h`
- `engine/src/RHI/D3D12/D3D12Device.cpp`
- If necessary:
  - `engine/include/XCEngine/RHI/RHITypes.h`
  - `engine/include/XCEngine/RHI/RHIEnums.h`
  - `engine/include/XCEngine/RHI/D3D12/D3D12Buffer.h`
  - `engine/src/RHI/D3D12/D3D12Buffer.cpp`

### Rendering

- `engine/src/Rendering/Caches/RenderResourceCache.cpp`
- `engine/include/XCEngine/Rendering/Caches/RenderResourceCache.h`
- `engine/src/Rendering/Passes/BuiltinVolumetricPass.cpp`
- If necessary, add a volume upload service or related state structures

### Resources / Components

- `engine/src/Components/VolumeRendererComponent.cpp`
- `engine/include/XCEngine/Components/VolumeRendererComponent.h`
- `engine/src/Resources/Volume/VolumeFieldLoader.cpp`
- `engine/include/XCEngine/Resources/Volume/VolumeField.h`

### Editor / Telemetry

- `editor/src/Managers/SceneManager.cpp`
- `editor/src/Viewport/ViewportHostService.h`

### Tests

- `tests/Resources/Volume/`
- `tests/Components/test_volume_renderer_component.cpp`
- If necessary, add rendering / integration regression verification

---

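## 8. Risks and Constraints

## 8.1 Do not bluntly change `StorageBuffer = DEFAULT heap` globally

Reasons:

1. Other existing callers may rely on `SetData()` by default
2. Changing the semantics globally would introduce hidden regressions
3. This round should center on "adding an initialized immutable path"

## 8.2 First make only the D3D12 mainline correct

The current user-facing problem occurs on the editor's D3D12 mainline, so:

1. This round prioritizes making D3D12 correct and fast
2. Vulkan / OpenGL stay compatible; the same level of optimization is not required in one step
3. But the interface design must not block later cross-backend unification

## 8.3 The draw path must not carry both restore and upload duties

This is a hard constraint.

As long as the draw path still "synchronously uploads 590MB the first time it sees a resource", the problem has not been solved at the root.

---
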
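## 9. Completion Criteria

This round of fixes counts as complete only when all of the following hold:

1. When the editor opens `Main.xc`, there is no ten-plus-second main-thread freeze after `Runtime streaming scene assets...` ends.
2. Under D3D12, the volume payload of `cloud.nvdb` ultimately resides in a GPU-local buffer rather than on the upload heap.
3. `BuiltinVolumetricPass` no longer performs a large synchronous upload on first draw.
4. The editor's total time to volume visible drops significantly compared with the current mainline.
5. The post-fix behavior can be explained clearly through logs and code paths, rather than judged "faster" by feel alone.

---
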
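## 10. Execution Order for This Round

Proceed strictly in this order:

1. Lock down metrics and logs
2. Add the RHI initialized GPU-local buffer capability
3. Switch volumes to the new upload path
4. Move the GPU upload ahead of the draw path
5. Then evaluate `xcvol` runtime CPU load optimization

Do not skip steps or open several major directions at once; knock out the biggest payoff first.
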
@@ -171,7 +171,6 @@ Shader "Builtin Volumetric"
     float shadow = 1.0f;
     float stepSize = 1.0f;

-    [unroll]
     for (int stepIndex = 0; stepIndex < 10; ++stepIndex) {
         const float3 samplePosition = localPosition + stepSize * localLightDirection;
         const float sigmaS =
@@ -4,6 +4,7 @@
 #include <initializer_list>
 #include <algorithm>
 #include <stdexcept>
+#include <type_traits>
 #include <XCEngine/Memory/Allocator.h>

 namespace XCEngine {
@@ -41,6 +42,7 @@ public:
     void Reserve(size_t capacity);
     void Resize(size_t newSize);
     void Resize(size_t newSize, const T& value);
+    void ResizeUninitialized(size_t newSize);

     void PushBack(const T& value);
     void PushBack(T&& value);
@@ -199,6 +201,22 @@ void Array<T>::Resize(size_t newSize) {
     m_size = newSize;
 }

+template<typename T>
+void Array<T>::ResizeUninitialized(size_t newSize) {
+    static_assert(
+        std::is_trivially_default_constructible_v<T>,
+        "ResizeUninitialized requires trivially default-constructible elements");
+    static_assert(
+        std::is_trivially_destructible_v<T>,
+        "ResizeUninitialized requires trivially destructible elements");
+
+    if (newSize > m_capacity) {
+        Reallocate(newSize);
+    }
+
+    m_size = newSize;
+}
+
 template<typename T>
 void Array<T>::Resize(size_t newSize, const T& value) {
     if (newSize > m_capacity) {
@@ -64,6 +64,7 @@ public:
     bool IsDeviceRemoved() const { return m_isDeviceRemoved; }

     RHIBuffer* CreateBuffer(const BufferDesc& desc) override;
+    RHIBuffer* CreateBuffer(const BufferDesc& desc, const void* initialData, size_t initialDataSize, ResourceStates finalState = ResourceStates::GenericRead) override;
     RHITexture* CreateTexture(const TextureDesc& desc) override;
     RHITexture* CreateTexture(const TextureDesc& desc, const void* initialData, size_t initialDataSize, uint32_t rowPitch = 0) override;
     RHISwapChain* CreateSwapChain(const SwapChainDesc& desc, RHICommandQueue* presentQueue) override;
@@ -2,6 +2,7 @@

 #include "RHITypes.h"
 #include "RHICapabilities.h"
+#include "RHIBuffer.h"
 #include "RHIDescriptorPool.h"
 #include "RHIDescriptorSet.h"
 #include "RHIRenderPass.h"
@@ -35,6 +36,40 @@ public:
     virtual void Shutdown() = 0;

     virtual RHIBuffer* CreateBuffer(const BufferDesc& desc) = 0;
+    virtual RHIBuffer* CreateBuffer(
+        const BufferDesc& desc,
+        const void* initialData,
+        size_t initialDataSize,
+        ResourceStates finalState = ResourceStates::GenericRead) {
+        if (initialData == nullptr || initialDataSize == 0u) {
+            return CreateBuffer(desc);
+        }
+        if (initialDataSize > desc.size) {
+            return nullptr;
+        }
+
+        RHIBuffer* buffer = CreateBuffer(desc);
+        if (buffer == nullptr) {
+            return nullptr;
+        }
+
+        buffer->SetData(initialData, initialDataSize);
+        if (desc.size > static_cast<uint64_t>(initialDataSize)) {
+            const std::array<unsigned char, 256> zeroBytes = {};
+            uint64_t remainingBytes = desc.size - static_cast<uint64_t>(initialDataSize);
+            size_t writeOffset = initialDataSize;
+            while (remainingBytes > 0u) {
+                const size_t chunkSize = static_cast<size_t>(
+                    remainingBytes > zeroBytes.size() ? zeroBytes.size() : remainingBytes);
+                buffer->SetData(zeroBytes.data(), chunkSize, writeOffset);
+                remainingBytes -= static_cast<uint64_t>(chunkSize);
+                writeOffset += chunkSize;
+            }
+        }
+
+        buffer->SetState(finalState);
+        return buffer;
+    }
     virtual RHITexture* CreateTexture(const TextureDesc& desc) = 0;
     virtual RHITexture* CreateTexture(const TextureDesc& desc, const void* initialData, size_t initialDataSize, uint32_t rowPitch = 0) = 0;
     virtual RHISwapChain* CreateSwapChain(const SwapChainDesc& desc, RHICommandQueue* presentQueue) = 0;
@@ -57,6 +57,13 @@ public:
         const VolumeIndexBounds& indexBounds = VolumeIndexBounds(),
         Core::uint32 gridType = 0u,
         Core::uint32 gridClass = 0u);
+    bool CreateOwned(VolumeStorageKind storageKind,
+        Containers::Array<Core::uint8>&& payload,
+        const Math::Bounds& bounds = Math::Bounds(),
+        const Math::Vector3& voxelSize = Math::Vector3::Zero(),
+        const VolumeIndexBounds& indexBounds = VolumeIndexBounds(),
+        Core::uint32 gridType = 0u,
+        Core::uint32 gridClass = 0u);

     VolumeStorageKind GetStorageKind() const { return m_storageKind; }
     const Math::Bounds& GetBounds() const { return m_bounds; }
@@ -69,6 +76,12 @@ public:
     size_t GetPayloadSize() const { return m_payload.Size(); }

 private:
+    bool ApplyMetadata(VolumeStorageKind storageKind,
+        const Math::Bounds& bounds,
+        const Math::Vector3& voxelSize,
+        const VolumeIndexBounds& indexBounds,
+        Core::uint32 gridType,
+        Core::uint32 gridClass);
     void UpdateMemorySize();

     VolumeStorageKind m_storageKind = VolumeStorageKind::Unknown;
@@ -17,7 +17,10 @@
#include "XCEngine/RHI/D3D12/D3D12ResourceView.h"
#include "XCEngine/RHI/D3D12/D3D12RenderPass.h"
#include "XCEngine/RHI/D3D12/D3D12Framebuffer.h"
#include "XCEngine/Debug/Logger.h"
#include <algorithm>
#include <chrono>
#include <cstring>
#include <stdio.h>
#include <memory>
#include <string>
@@ -41,11 +44,51 @@ std::string NarrowAscii(const std::wstring& value) {
     return result;
 }

+uint64_t GetVolumeTraceSteadyMs();
+void LogVolumeTraceRendering(const std::string& message);
+
 bool HasShaderPayload(const ShaderCompileDesc& desc) {
     return !desc.source.empty() || !desc.fileName.empty();
 }

+bool ShouldTraceVolumetricShaderCompile(const ShaderCompileDesc& desc) {
+    const std::string fileName = NarrowAscii(desc.fileName);
+    if (fileName.find("volumetric") != std::string::npos) {
+        return true;
+    }
+
+    if (!desc.source.empty()) {
+        const std::string sourceText(desc.source.begin(), desc.source.end());
+        if (sourceText.find("PNANOVDB_HLSL") != std::string::npos ||
+            sourceText.find("VolumeData") != std::string::npos) {
+            return true;
+        }
+    }
+
+    return false;
+}
+
+std::string DescribeShaderCompileDesc(const ShaderCompileDesc& desc) {
+    std::string description =
+        "entry=" + NarrowAscii(desc.entryPoint) +
+        " profile=" + NarrowAscii(desc.profile) +
+        " source_bytes=" + std::to_string(desc.source.size()) +
+        " macro_count=" + std::to_string(desc.macros.size());
+    if (!desc.fileName.empty()) {
+        description += " file=" + NarrowAscii(desc.fileName);
+    }
+    return description;
+}
+
 bool CompileD3D12Shader(const ShaderCompileDesc& desc, D3D12Shader& shader) {
+    const bool traceShaderCompile = ShouldTraceVolumetricShaderCompile(desc);
+    const uint64_t compileStartMs = traceShaderCompile ? GetVolumeTraceSteadyMs() : 0u;
+    if (traceShaderCompile) {
+        LogVolumeTraceRendering(
+            "D3D12 shader compile begin steady_ms=" + std::to_string(compileStartMs) + " " +
+            DescribeShaderCompileDesc(desc));
+    }
+
     const std::string entryPoint = NarrowAscii(desc.entryPoint);
     const std::string profile = NarrowAscii(desc.profile);
     const char* entryPointPtr = entryPoint.empty() ? nullptr : entryPoint.c_str();
@@ -76,19 +119,44 @@ bool CompileD3D12Shader(const ShaderCompileDesc& desc, D3D12Shader& shader) {
         }

         const D3D_SHADER_MACRO* macroPtr = macroTable.empty() ? nullptr : macroTable.data();
-        return shader.Compile(
+        const bool compiled = shader.Compile(
             desc.source.data(),
             desc.source.size(),
             desc.fileName.empty() ? nullptr : desc.fileName.c_str(),
             macroPtr,
             entryPointPtr,
             profilePtr);
+        if (traceShaderCompile) {
+            const uint64_t compileEndMs = GetVolumeTraceSteadyMs();
+            LogVolumeTraceRendering(
+                std::string("D3D12 shader compile ") + (compiled ? "end" : "failed") +
+                " steady_ms=" + std::to_string(compileEndMs) +
+                " total_ms=" + std::to_string(compileEndMs - compileStartMs) + " " +
+                DescribeShaderCompileDesc(desc));
+        }
+        return compiled;
     }

     if (!desc.fileName.empty()) {
-        return shader.CompileFromFile(desc.fileName.c_str(), entryPointPtr, profilePtr);
+        const bool compiled = shader.CompileFromFile(desc.fileName.c_str(), entryPointPtr, profilePtr);
+        if (traceShaderCompile) {
+            const uint64_t compileEndMs = GetVolumeTraceSteadyMs();
+            LogVolumeTraceRendering(
+                std::string("D3D12 shader compile ") + (compiled ? "end" : "failed") +
+                " steady_ms=" + std::to_string(compileEndMs) +
+                " total_ms=" + std::to_string(compileEndMs - compileStartMs) + " " +
+                DescribeShaderCompileDesc(desc));
+        }
+        return compiled;
     }

+    if (traceShaderCompile) {
+        const uint64_t compileEndMs = GetVolumeTraceSteadyMs();
+        LogVolumeTraceRendering(
+            "D3D12 shader compile failed steady_ms=" + std::to_string(compileEndMs) +
+            " total_ms=" + std::to_string(compileEndMs - compileStartMs) +
+            " reason=empty_shader_payload " + DescribeShaderCompileDesc(desc));
+    }
     return false;
 }

@@ -318,6 +386,24 @@ bool IsSupportedBufferViewDimension(ResourceViewDimension dimension) {
         dimension == ResourceViewDimension::RawBuffer;
 }

+uint64_t GetVolumeTraceSteadyMs() {
+    using Clock = std::chrono::steady_clock;
+    static const Clock::time_point s_start = Clock::now();
+    return static_cast<uint64_t>(std::chrono::duration_cast<std::chrono::milliseconds>(
+        Clock::now() - s_start).count());
+}
+
+bool ShouldTraceLargeStorageBuffer(const BufferDesc& desc) {
+    return static_cast<BufferType>(desc.bufferType) == BufferType::Storage &&
+        desc.size >= 32ull * 1024ull * 1024ull;
+}
+
+void LogVolumeTraceRendering(const std::string& message) {
+    Containers::String entry("[VolumeTrace] ");
+    entry += message.c_str();
+    Debug::Logger::Get().Info(Debug::LogCategory::Rendering, entry);
+}
+
 uint32_t ResolveBufferViewElementStride(RHIBuffer* buffer, const ResourceViewDesc& desc) {
     if (desc.dimension == ResourceViewDimension::RawBuffer) {
         return 4u;
@@ -796,6 +882,13 @@ RHIBuffer* D3D12Device::CreateBuffer(const BufferDesc& desc) {
         }
     }

+    if (ShouldTraceLargeStorageBuffer(desc)) {
+        LogVolumeTraceRendering(
+            "D3D12 CreateBuffer legacy size_bytes=" + std::to_string(desc.size) +
+            " heap=" + std::to_string(static_cast<int>(heapType)) +
+            " flags=" + std::to_string(static_cast<unsigned long long>(desc.flags)));
+    }
+
     if (buffer->Initialize(m_device.Get(), desc.size, initialState, heapType, resourceFlags)) {
         buffer->SetStride(desc.stride);
         buffer->SetBufferType(bufferType);
@@ -806,6 +899,209 @@ RHIBuffer* D3D12Device::CreateBuffer(const BufferDesc& desc) {
|
||||
return nullptr;
|
||||
}
|
||||
|
||||
RHIBuffer* D3D12Device::CreateBuffer(
|
||||
const BufferDesc& desc,
|
||||
const void* initialData,
|
||||
size_t initialDataSize,
|
||||
ResourceStates finalState) {
|
||||
const bool traceLargeStorageBuffer = ShouldTraceLargeStorageBuffer(desc);
|
||||
const uint64_t traceStartMs = traceLargeStorageBuffer ? GetVolumeTraceSteadyMs() : 0u;
|
||||
if (traceLargeStorageBuffer) {
|
||||
LogVolumeTraceRendering(
|
||||
"D3D12 CreateBuffer(initialData) begin steady_ms=" + std::to_string(traceStartMs) +
|
||||
" size_bytes=" + std::to_string(desc.size) +
|
||||
" initial_bytes=" + std::to_string(initialDataSize));
|
||||
}
|
||||
|
||||
if (initialData == nullptr || initialDataSize == 0u) {
|
||||
return CreateBuffer(desc);
|
||||
}
|
||||
if (m_device == nullptr ||
|
||||
desc.size == 0u ||
|
||||
initialDataSize > desc.size ||
|
||||
desc.size > static_cast<uint64_t>(SIZE_MAX)) {
|
||||
if (traceLargeStorageBuffer) {
|
||||
LogVolumeTraceRendering("D3D12 CreateBuffer(initialData) rejected invalid parameters");
|
||||
}
|
||||
return nullptr;
|
||||
}
|
||||
|
||||
const BufferType bufferType = static_cast<BufferType>(desc.bufferType);
|
||||
if (bufferType == BufferType::ReadBack) {
|
||||
if (traceLargeStorageBuffer) {
|
||||
LogVolumeTraceRendering("D3D12 CreateBuffer(initialData) rejected readback buffer");
|
||||
}
|
||||
return nullptr;
|
||||
}
|
||||
|
||||
const BufferFlags bufferFlags = static_cast<BufferFlags>(desc.flags);
|
||||
D3D12_RESOURCE_FLAGS resourceFlags = D3D12_RESOURCE_FLAG_NONE;
|
||||
if ((bufferFlags & BufferFlags::AllowUnorderedAccess) == BufferFlags::AllowUnorderedAccess) {
|
||||
resourceFlags = D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS;
|
||||
}
|
||||
|
||||
D3D12CommandQueue uploadQueue;
|
||||
if (!uploadQueue.Initialize(m_device.Get(), CommandQueueType::Direct)) {
|
||||
if (traceLargeStorageBuffer) {
|
||||
LogVolumeTraceRendering("D3D12 CreateBuffer(initialData) failed stage=init_queue");
|
||||
}
|
||||
return nullptr;
|
||||
}
|
||||
|
||||
D3D12CommandAllocator uploadAllocator;
|
||||
if (!uploadAllocator.Initialize(m_device.Get(), CommandQueueType::Direct)) {
|
||||
if (traceLargeStorageBuffer) {
|
||||
LogVolumeTraceRendering("D3D12 CreateBuffer(initialData) failed stage=init_allocator");
|
||||
}
|
||||
uploadQueue.Shutdown();
|
||||
return nullptr;
|
||||
}
|
||||
|
||||
D3D12CommandList uploadCommandList;
|
||||
if (!uploadCommandList.Initialize(m_device.Get(), CommandQueueType::Direct, uploadAllocator.GetCommandAllocator())) {
|
||||
if (traceLargeStorageBuffer) {
|
||||
LogVolumeTraceRendering("D3D12 CreateBuffer(initialData) failed stage=init_command_list");
|
||||
}
|
||||
uploadAllocator.Shutdown();
|
||||
uploadQueue.Shutdown();
|
||||
return nullptr;
|
||||
}
|
||||
|
||||
auto shutdownUploadContext = [&]() {
|
||||
uploadCommandList.Shutdown();
|
||||
uploadAllocator.Shutdown();
|
||||
uploadQueue.Shutdown();
|
||||
};
|
||||
|
||||
uploadAllocator.Reset();
|
||||
uploadCommandList.Reset();
|
||||
const uint64_t commandSetupEndMs = traceLargeStorageBuffer ? GetVolumeTraceSteadyMs() : 0u;
|
||||
|
||||
auto* buffer = new D3D12Buffer();
|
||||
if (!buffer->Initialize(
|
||||
m_device.Get(),
|
||||
desc.size,
|
||||
D3D12_RESOURCE_STATE_COPY_DEST,
|
||||
D3D12_HEAP_TYPE_DEFAULT,
|
||||
resourceFlags)) {
|
||||
if (traceLargeStorageBuffer) {
|
||||
LogVolumeTraceRendering("D3D12 CreateBuffer(initialData) failed stage=create_default_buffer");
|
||||
}
|
||||
delete buffer;
|
||||
shutdownUploadContext();
|
||||
return nullptr;
|
||||
}
|
||||
const uint64_t defaultBufferEndMs = traceLargeStorageBuffer ? GetVolumeTraceSteadyMs() : 0u;
|
||||
|
||||
buffer->SetStride(desc.stride);
|
||||
buffer->SetBufferType(bufferType);
|
||||
buffer->SetState(ResourceStates::CopyDst);
|
||||
|
||||
D3D12_HEAP_PROPERTIES uploadHeapProperties = {};
|
||||
uploadHeapProperties.Type = D3D12_HEAP_TYPE_UPLOAD;
|
||||
uploadHeapProperties.CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_UNKNOWN;
|
||||
uploadHeapProperties.MemoryPoolPreference = D3D12_MEMORY_POOL_UNKNOWN;
|
||||
uploadHeapProperties.CreationNodeMask = 0;
|
||||
uploadHeapProperties.VisibleNodeMask = 0;
|
||||
|
||||
D3D12_RESOURCE_DESC uploadBufferDesc = {};
|
||||
uploadBufferDesc.Dimension = D3D12_RESOURCE_DIMENSION_BUFFER;
|
||||
uploadBufferDesc.Alignment = 0;
|
||||
uploadBufferDesc.Width = desc.size;
|
||||
uploadBufferDesc.Height = 1;
|
||||
uploadBufferDesc.DepthOrArraySize = 1;
|
||||
uploadBufferDesc.MipLevels = 1;
|
||||
uploadBufferDesc.Format = DXGI_FORMAT_UNKNOWN;
|
||||
uploadBufferDesc.SampleDesc.Count = 1;
|
||||
uploadBufferDesc.SampleDesc.Quality = 0;
|
||||
uploadBufferDesc.Layout = D3D12_TEXTURE_LAYOUT_ROW_MAJOR;
|
||||
uploadBufferDesc.Flags = D3D12_RESOURCE_FLAG_NONE;
|
||||
|
||||
ComPtr<ID3D12Resource> uploadBuffer;
|
||||
if (FAILED(m_device->CreateCommittedResource(
|
||||
&uploadHeapProperties,
|
||||
D3D12_HEAP_FLAG_NONE,
|
||||
&uploadBufferDesc,
|
||||
D3D12_RESOURCE_STATE_GENERIC_READ,
|
||||
nullptr,
|
||||
IID_PPV_ARGS(&uploadBuffer)))) {
|
||||
if (traceLargeStorageBuffer) {
|
||||
LogVolumeTraceRendering("D3D12 CreateBuffer(initialData) failed stage=create_upload_buffer");
|
||||
}
|
||||
buffer->Shutdown();
|
||||
delete buffer;
|
||||
shutdownUploadContext();
|
||||
return nullptr;
|
||||
}
|
||||
const uint64_t uploadBufferEndMs = traceLargeStorageBuffer ? GetVolumeTraceSteadyMs() : 0u;
|
||||
|
||||
void* mappedData = nullptr;
|
||||
D3D12_RANGE readRange = { 0, 0 };
|
||||
if (FAILED(uploadBuffer->Map(0, &readRange, &mappedData)) || mappedData == nullptr) {
|
||||
if (traceLargeStorageBuffer) {
|
||||
LogVolumeTraceRendering("D3D12 CreateBuffer(initialData) failed stage=map_upload_buffer");
|
||||
}
|
||||
buffer->Shutdown();
|
||||
delete buffer;
|
||||
shutdownUploadContext();
|
||||
return nullptr;
|
||||
}
|
||||
|
||||
std::memset(mappedData, 0, static_cast<size_t>(desc.size));
|
||||
std::memcpy(mappedData, initialData, initialDataSize);
|
||||
uploadBuffer->Unmap(0, nullptr);
|
||||
const uint64_t cpuFillEndMs = traceLargeStorageBuffer ? GetVolumeTraceSteadyMs() : 0u;
|
||||
|
||||
ID3D12GraphicsCommandList* const commandList = uploadCommandList.GetCommandList();
|
||||
commandList->CopyBufferRegion(
|
||||
buffer->GetResource(),
|
||||
0,
|
||||
uploadBuffer.Get(),
|
||||
0,
|
||||
desc.size);
|
||||
|
||||
const D3D12_RESOURCE_STATES resolvedFinalState = ToD3D12(finalState);
|
||||
if (resolvedFinalState != D3D12_RESOURCE_STATE_COPY_DEST) {
|
||||
D3D12_RESOURCE_BARRIER barrier = {};
|
||||
barrier.Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
|
||||
barrier.Flags = D3D12_RESOURCE_BARRIER_FLAG_NONE;
|
||||
barrier.Transition.pResource = buffer->GetResource();
|
||||
barrier.Transition.StateBefore = D3D12_RESOURCE_STATE_COPY_DEST;
|
||||
barrier.Transition.StateAfter = resolvedFinalState;
|
||||
barrier.Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
|
||||
commandList->ResourceBarrier(1, &barrier);
|
||||
}
|
||||
const uint64_t recordCommandsEndMs = traceLargeStorageBuffer ? GetVolumeTraceSteadyMs() : 0u;
|
||||
|
||||
uploadCommandList.Close();
|
||||
ID3D12CommandList* commandLists[] = { commandList };
|
||||
uploadQueue.ExecuteCommandListsInternal(1, commandLists);
|
||||
const uint64_t submitEndMs = traceLargeStorageBuffer ? GetVolumeTraceSteadyMs() : 0u;
|
||||
if (traceLargeStorageBuffer) {
|
||||
LogVolumeTraceRendering(
|
||||
"D3D12 CreateBuffer(initialData) waiting_for_idle steady_ms=" + std::to_string(submitEndMs) +
|
||||
" size_bytes=" + std::to_string(desc.size));
|
||||
}
|
||||
uploadQueue.WaitForIdle();
|
||||
const uint64_t waitEndMs = traceLargeStorageBuffer ? GetVolumeTraceSteadyMs() : 0u;
|
||||
|
||||
buffer->SetState(finalState);
|
||||
if (traceLargeStorageBuffer) {
|
||||
LogVolumeTraceRendering(
|
||||
"D3D12 CreateBuffer(initialData) end steady_ms=" + std::to_string(waitEndMs) +
|
||||
" total_ms=" + std::to_string(waitEndMs - traceStartMs) +
|
||||
" setup_ms=" + std::to_string(commandSetupEndMs - traceStartMs) +
|
||||
" default_buffer_ms=" + std::to_string(defaultBufferEndMs - commandSetupEndMs) +
|
||||
" upload_buffer_ms=" + std::to_string(uploadBufferEndMs - defaultBufferEndMs) +
|
||||
" cpu_fill_ms=" + std::to_string(cpuFillEndMs - uploadBufferEndMs) +
|
||||
" record_ms=" + std::to_string(recordCommandsEndMs - cpuFillEndMs) +
|
||||
" submit_ms=" + std::to_string(submitEndMs - recordCommandsEndMs) +
|
||||
" wait_ms=" + std::to_string(waitEndMs - submitEndMs));
|
||||
}
|
||||
shutdownUploadContext();
|
||||
return buffer;
|
||||
}
|
||||
|
||||
RHITexture* D3D12Device::CreateTexture(const TextureDesc& desc) {
|
||||
auto* texture = new D3D12Texture();
|
||||
D3D12_RESOURCE_DESC d3d12Desc = {};
|
||||
@@ -1112,6 +1408,19 @@ RHICommandQueue* D3D12Device::CreateCommandQueue(const CommandQueueDesc& desc) {
|
||||
}
|
||||
|
||||
RHIPipelineState* D3D12Device::CreatePipelineState(const GraphicsPipelineDesc& desc) {
|
||||
const bool traceVolumetricPipeline =
|
||||
ShouldTraceVolumetricShaderCompile(desc.vertexShader) ||
|
||||
ShouldTraceVolumetricShaderCompile(desc.fragmentShader) ||
|
||||
ShouldTraceVolumetricShaderCompile(desc.geometryShader);
|
||||
const uint64_t pipelineStartMs = traceVolumetricPipeline ? GetVolumeTraceSteadyMs() : 0u;
|
||||
if (traceVolumetricPipeline) {
|
||||
LogVolumeTraceRendering(
|
||||
"D3D12 CreatePipelineState begin steady_ms=" + std::to_string(pipelineStartMs) +
|
||||
" has_vs=" + std::to_string(HasShaderPayload(desc.vertexShader) ? 1 : 0) +
|
||||
" has_ps=" + std::to_string(HasShaderPayload(desc.fragmentShader) ? 1 : 0) +
|
||||
" has_gs=" + std::to_string(HasShaderPayload(desc.geometryShader) ? 1 : 0));
|
||||
}
|
||||
|
||||
auto* pso = new D3D12PipelineState(m_device.Get());
|
||||
pso->SetInputLayout(desc.inputLayout);
|
||||
pso->SetRasterizerState(desc.rasterizerState);
|
||||
@@ -1120,6 +1429,7 @@ RHIPipelineState* D3D12Device::CreatePipelineState(const GraphicsPipelineDesc& d
|
||||
pso->SetTopology(desc.topologyType);
|
||||
pso->SetRenderTargetFormats(desc.renderTargetCount, desc.renderTargetFormats, desc.depthStencilFormat);
|
||||
pso->SetSampleCount(desc.sampleCount);
|
||||
pso->SetSampleQuality(desc.sampleQuality);
|
||||
|
||||
const bool hasVertexShader = HasShaderPayload(desc.vertexShader);
|
||||
const bool hasFragmentShader = HasShaderPayload(desc.fragmentShader);
|
||||
@@ -1154,6 +1464,15 @@ RHIPipelineState* D3D12Device::CreatePipelineState(const GraphicsPipelineDesc& d
|
||||
const bool geometryCompiled = !hasGeometryShader || CompileD3D12Shader(desc.geometryShader, geometryShader);
|
||||
|
||||
if (!vertexCompiled || !fragmentCompiled || !geometryCompiled) {
|
||||
if (traceVolumetricPipeline) {
|
||||
const uint64_t failureMs = GetVolumeTraceSteadyMs();
|
||||
LogVolumeTraceRendering(
|
||||
"D3D12 CreatePipelineState failed steady_ms=" + std::to_string(failureMs) +
|
||||
" total_ms=" + std::to_string(failureMs - pipelineStartMs) +
|
||||
" vertex_ok=" + std::to_string(vertexCompiled ? 1 : 0) +
|
||||
" fragment_ok=" + std::to_string(fragmentCompiled ? 1 : 0) +
|
||||
" geometry_ok=" + std::to_string(geometryCompiled ? 1 : 0));
|
||||
}
|
||||
if (rootSignature != nullptr) {
|
||||
rootSignature->Shutdown();
|
||||
delete rootSignature;
|
||||
@@ -1166,7 +1485,19 @@ RHIPipelineState* D3D12Device::CreatePipelineState(const GraphicsPipelineDesc& d
|
||||
vertexShader.GetD3D12Bytecode(),
|
||||
fragmentShader.GetD3D12Bytecode(),
|
||||
hasGeometryShader ? geometryShader.GetD3D12Bytecode() : D3D12_SHADER_BYTECODE{});
|
||||
const uint64_t finalizeStartMs = traceVolumetricPipeline ? GetVolumeTraceSteadyMs() : 0u;
|
||||
if (traceVolumetricPipeline) {
|
||||
LogVolumeTraceRendering(
|
||||
"D3D12 CreatePipelineState finalize begin steady_ms=" + std::to_string(finalizeStartMs));
|
||||
}
|
||||
pso->EnsureValid();
|
||||
if (traceVolumetricPipeline) {
|
||||
const uint64_t finalizeEndMs = GetVolumeTraceSteadyMs();
|
||||
LogVolumeTraceRendering(
|
||||
"D3D12 CreatePipelineState finalize end steady_ms=" + std::to_string(finalizeEndMs) +
|
||||
" total_ms=" + std::to_string(finalizeEndMs - finalizeStartMs) +
|
||||
" valid=" + std::to_string(pso->IsValid() ? 1 : 0));
|
||||
}
|
||||
|
||||
if (rootSignature != nullptr) {
|
||||
rootSignature->Shutdown();
|
||||
@@ -1174,10 +1505,22 @@ RHIPipelineState* D3D12Device::CreatePipelineState(const GraphicsPipelineDesc& d
|
||||
}
|
||||
|
||||
if (!pso->IsValid()) {
|
||||
if (traceVolumetricPipeline) {
|
||||
const uint64_t failureMs = GetVolumeTraceSteadyMs();
|
||||
LogVolumeTraceRendering(
|
||||
"D3D12 CreatePipelineState invalid steady_ms=" + std::to_string(failureMs) +
|
||||
" total_ms=" + std::to_string(failureMs - pipelineStartMs));
|
||||
}
|
||||
delete pso;
|
||||
return nullptr;
|
||||
}
|
||||
|
||||
if (traceVolumetricPipeline) {
|
||||
const uint64_t pipelineEndMs = GetVolumeTraceSteadyMs();
|
||||
LogVolumeTraceRendering(
|
||||
"D3D12 CreatePipelineState end steady_ms=" + std::to_string(pipelineEndMs) +
|
||||
" total_ms=" + std::to_string(pipelineEndMs - pipelineStartMs));
|
||||
}
|
||||
return pso;
|
||||
}
|
||||
|
||||
|
||||
@@ -1,6 +1,7 @@
|
||||
#include <XCEngine/Resources/Volume/VolumeField.h>
|
||||
|
||||
#include <cstring>
|
||||
#include <utility>
|
||||
|
||||
namespace XCEngine {
|
||||
namespace Resources {
|
||||
@@ -13,6 +14,21 @@ void VolumeField::Release() {
|
||||
delete this;
|
||||
}
|
||||
|
||||
bool VolumeField::ApplyMetadata(VolumeStorageKind storageKind,
|
||||
const Math::Bounds& bounds,
|
||||
const Math::Vector3& voxelSize,
|
||||
const VolumeIndexBounds& indexBounds,
|
||||
Core::uint32 gridType,
|
||||
Core::uint32 gridClass) {
|
||||
m_storageKind = storageKind;
|
||||
m_bounds = bounds;
|
||||
m_voxelSize = voxelSize;
|
||||
m_indexBounds = indexBounds;
|
||||
m_gridType = gridType;
|
||||
m_gridClass = gridClass;
|
||||
return true;
|
||||
}
|
||||
|
||||
bool VolumeField::Create(VolumeStorageKind storageKind,
|
||||
const void* payload,
|
||||
size_t payloadSize,
|
||||
@@ -25,19 +41,32 @@ bool VolumeField::Create(VolumeStorageKind storageKind,
         return false;
     }
 
-    m_storageKind = storageKind;
-    m_bounds = bounds;
-    m_voxelSize = voxelSize;
-    m_indexBounds = indexBounds;
-    m_gridType = gridType;
-    m_gridClass = gridClass;
-    m_payload.Resize(payloadSize);
+    ApplyMetadata(storageKind, bounds, voxelSize, indexBounds, gridType, gridClass);
+    m_payload.ResizeUninitialized(payloadSize);
     std::memcpy(m_payload.Data(), payload, payloadSize);
     m_isValid = true;
     UpdateMemorySize();
     return true;
 }
 
+bool VolumeField::CreateOwned(VolumeStorageKind storageKind,
+                              Containers::Array<Core::uint8>&& payload,
+                              const Math::Bounds& bounds,
+                              const Math::Vector3& voxelSize,
+                              const VolumeIndexBounds& indexBounds,
+                              Core::uint32 gridType,
+                              Core::uint32 gridClass) {
+    if (payload.Empty()) {
+        return false;
+    }
+
+    ApplyMetadata(storageKind, bounds, voxelSize, indexBounds, gridType, gridClass);
+    m_payload = std::move(payload);
+    m_isValid = true;
+    UpdateMemorySize();
+    return true;
+}
+
 void VolumeField::UpdateMemorySize() {
     m_memorySize = sizeof(VolumeField) +
                    m_name.Length() +
||||
|
||||
@@ -2,6 +2,7 @@
|
||||
|
||||
#include <XCEngine/Core/Asset/ArtifactFormats.h>
|
||||
#include <XCEngine/Core/Asset/ResourceManager.h>
|
||||
#include <XCEngine/Debug/Logger.h>
|
||||
#include <XCEngine/Resources/Volume/VolumeField.h>
|
||||
|
||||
#if defined(XCENGINE_HAS_NANOVDB)
|
||||
@@ -11,15 +12,23 @@
|
||||
#endif
|
||||
|
||||
#include <cmath>
|
||||
#include <chrono>
|
||||
#include <cstring>
|
||||
#include <filesystem>
|
||||
#include <fstream>
|
||||
#include <utility>
|
||||
|
||||
namespace XCEngine {
|
||||
namespace Resources {
|
||||
|
||||
namespace {
|
||||
|
||||
void LogVolumeTraceFileSystem(const std::string& message) {
|
||||
Containers::String entry("[VolumeTrace] ");
|
||||
entry += message.c_str();
|
||||
Debug::Logger::Get().Info(Debug::LogCategory::FileSystem, entry);
|
||||
}
|
||||
|
||||
Containers::String GetResourceNameFromPath(const Containers::String& path) {
|
||||
const std::filesystem::path filePath(path.CStr());
|
||||
const std::string fileName = filePath.filename().string();
|
||||
@@ -29,28 +38,26 @@ Containers::String GetResourceNameFromPath(const Containers::String& path) {
     return path;
 }
 
-LoadResult CreateVolumeFieldResource(const Containers::String& path,
-                                     VolumeStorageKind storageKind,
-                                     const Math::Bounds& bounds,
-                                     const Math::Vector3& voxelSize,
-                                     const VolumeIndexBounds& indexBounds,
-                                     Core::uint32 gridType,
-                                     Core::uint32 gridClass,
-                                     const void* payload,
-                                     size_t payloadSize) {
+LoadResult CreateOwnedVolumeFieldResource(const Containers::String& path,
+                                          VolumeStorageKind storageKind,
+                                          const Math::Bounds& bounds,
+                                          const Math::Vector3& voxelSize,
+                                          const VolumeIndexBounds& indexBounds,
+                                          Core::uint32 gridType,
+                                          Core::uint32 gridClass,
+                                          Containers::Array<Core::uint8>&& payload) {
     auto* volumeField = new VolumeField();
 
     IResource::ConstructParams params;
     params.name = GetResourceNameFromPath(path);
     params.path = path;
     params.guid = ResourceGUID::Generate(path);
-    params.memorySize = payloadSize;
+    params.memorySize = payload.Size();
     volumeField->Initialize(params);
 
-    if (!volumeField->Create(
+    if (!volumeField->CreateOwned(
             storageKind,
-            payload,
-            payloadSize,
+            std::move(payload),
             bounds,
             voxelSize,
             indexBounds,
||||
@@ -77,14 +84,19 @@ std::filesystem::path ResolveVolumeFieldPath(const Containers::String& path) {
|
||||
|
||||
LoadResult LoadVolumeFieldArtifact(const Containers::String& path) {
|
||||
const std::filesystem::path resolvedPath = ResolveVolumeFieldPath(path);
|
||||
const auto totalStart = std::chrono::steady_clock::now();
|
||||
|
||||
const auto openStart = std::chrono::steady_clock::now();
|
||||
std::ifstream input(resolvedPath, std::ios::binary);
|
||||
const auto openEnd = std::chrono::steady_clock::now();
|
||||
if (!input.is_open()) {
|
||||
return LoadResult(Containers::String("Failed to read volume artifact: ") + path);
|
||||
}
|
||||
|
||||
const auto headerStart = std::chrono::steady_clock::now();
|
||||
VolumeFieldArtifactHeader header;
|
||||
input.read(reinterpret_cast<char*>(&header), sizeof(header));
|
||||
const auto headerEnd = std::chrono::steady_clock::now();
|
||||
if (!input) {
|
||||
return LoadResult(Containers::String("Failed to parse volume artifact header: ") + path);
|
||||
}
|
||||
@@ -98,11 +110,27 @@ LoadResult LoadVolumeFieldArtifact(const Containers::String& path) {
     }
 
     Containers::Array<Core::uint8> payload;
-    payload.Resize(static_cast<size_t>(header.payloadSize));
+    payload.ResizeUninitialized(static_cast<size_t>(header.payloadSize));
+    const auto payloadReadStart = std::chrono::steady_clock::now();
     input.read(reinterpret_cast<char*>(payload.Data()), static_cast<std::streamsize>(header.payloadSize));
+    const auto payloadReadEnd = std::chrono::steady_clock::now();
     if (!input) {
         return LoadResult(Containers::String("Failed to read volume artifact payload: ") + path);
     }
+    const auto totalEnd = std::chrono::steady_clock::now();
+    LogVolumeTraceFileSystem(
+        "VolumeFieldLoader Artifact path=" + std::string(path.CStr()) +
+        " resolved=" + resolvedPath.generic_string() +
+        " open_ms=" +
+        std::to_string(std::chrono::duration_cast<std::chrono::milliseconds>(openEnd - openStart).count()) +
+        " header_ms=" +
+        std::to_string(std::chrono::duration_cast<std::chrono::milliseconds>(headerEnd - headerStart).count()) +
+        " payload_read_ms=" +
+        std::to_string(std::chrono::duration_cast<std::chrono::milliseconds>(payloadReadEnd - payloadReadStart).count()) +
+        " total_ms=" +
+        std::to_string(std::chrono::duration_cast<std::chrono::milliseconds>(totalEnd - totalStart).count()) +
+        " payload_bytes=" +
+        std::to_string(payload.Size()));
 
     Math::Bounds bounds;
     bounds.SetMinMax(header.boundsMin, header.boundsMax);
||||
@@ -115,15 +143,14 @@ LoadResult LoadVolumeFieldArtifact(const Containers::String& path) {
     indexBounds.maxY = header.indexBoundsMax[1];
     indexBounds.maxZ = header.indexBoundsMax[2];
 
-    return CreateVolumeFieldResource(path,
-                                     static_cast<VolumeStorageKind>(header.storageKind),
-                                     bounds,
-                                     header.voxelSize,
-                                     indexBounds,
-                                     header.gridType,
-                                     header.gridClass,
-                                     payload.Data(),
-                                     payload.Size());
+    return CreateOwnedVolumeFieldResource(path,
+                                          static_cast<VolumeStorageKind>(header.storageKind),
+                                          bounds,
+                                          header.voxelSize,
+                                          indexBounds,
+                                          header.gridType,
+                                          header.gridClass,
+                                          std::move(payload));
 }
 
 #if defined(XCENGINE_HAS_NANOVDB)
||||
@@ -176,8 +203,11 @@ LoadResult LoadNanoVDBSourceFile(const Containers::String& path) {
|
||||
}
|
||||
|
||||
try {
|
||||
const auto totalStart = std::chrono::steady_clock::now();
|
||||
const auto readStart = std::chrono::steady_clock::now();
|
||||
nanovdb::GridHandle<nanovdb::HostBuffer> handle =
|
||||
nanovdb::io::readGrid<nanovdb::HostBuffer>(resolvedPath.string());
|
||||
const auto readEnd = std::chrono::steady_clock::now();
|
||||
if (!handle || handle.data() == nullptr || handle.bufferSize() == 0u) {
|
||||
return LoadResult(Containers::String("Failed to parse NanoVDB grid payload: ") + path);
|
||||
}
|
||||
@@ -199,7 +229,24 @@ LoadResult LoadNanoVDBSourceFile(const Containers::String& path) {
             }
         }
 
-        return CreateVolumeFieldResource(
+        Containers::Array<Core::uint8> payload;
+        payload.ResizeUninitialized(static_cast<size_t>(handle.bufferSize()));
+        const auto copyStart = std::chrono::steady_clock::now();
+        std::memcpy(payload.Data(), handle.data(), payload.Size());
+        const auto copyEnd = std::chrono::steady_clock::now();
+        const auto totalEnd = std::chrono::steady_clock::now();
+        LogVolumeTraceFileSystem(
+            "VolumeFieldLoader NanoVDB path=" + std::string(path.CStr()) +
+            " read_ms=" +
+            std::to_string(std::chrono::duration_cast<std::chrono::milliseconds>(readEnd - readStart).count()) +
+            " memcpy_ms=" +
+            std::to_string(std::chrono::duration_cast<std::chrono::milliseconds>(copyEnd - copyStart).count()) +
+            " total_ms=" +
+            std::to_string(std::chrono::duration_cast<std::chrono::milliseconds>(totalEnd - totalStart).count()) +
+            " payload_bytes=" +
+            std::to_string(payload.Size()));
+
+        return CreateOwnedVolumeFieldResource(
             path,
             VolumeStorageKind::NanoVDB,
             bounds,
||||
@@ -207,8 +254,7 @@ LoadResult LoadNanoVDBSourceFile(const Containers::String& path) {
             indexBounds,
             gridType,
             gridClass,
-            handle.data(),
-            static_cast<size_t>(handle.bufferSize()));
+            std::move(payload));
     } catch (const std::exception& e) {
         return LoadResult(
             Containers::String("Failed to parse NanoVDB file: ") +
||||
|
||||
@@ -2,6 +2,8 @@
|
||||
|
||||
#include <XCEngine/Resources/Volume/VolumeField.h>
|
||||
|
||||
#include <utility>
|
||||
|
||||
using namespace XCEngine::Resources;
|
||||
|
||||
namespace {
|
||||
@@ -45,4 +47,38 @@ TEST(VolumeField, CreatePreservesPayloadAndMetadata) {
|
||||
EXPECT_GT(volumeField.GetMemorySize(), sizeof(VolumeField));
|
||||
}
|
||||
|
||||
TEST(VolumeField, CreateOwnedPreservesPayloadAndMetadata) {
|
||||
XCEngine::Containers::Array<unsigned char> payload;
|
||||
payload.Resize(4);
|
||||
payload[0] = 9u;
|
||||
payload[1] = 8u;
|
||||
payload[2] = 7u;
|
||||
payload[3] = 6u;
|
||||
|
||||
VolumeField volumeField;
|
||||
IResource::ConstructParams params;
|
||||
params.name = "cloud.xcvol";
|
||||
params.path = "Library/Artifacts/ab/main.xcvol";
|
||||
params.guid = ResourceGUID::Generate(params.path);
|
||||
volumeField.Initialize(params);
|
||||
|
||||
ASSERT_TRUE(volumeField.CreateOwned(
|
||||
VolumeStorageKind::NanoVDB,
|
||||
std::move(payload),
|
||||
XCEngine::Math::Bounds(),
|
||||
XCEngine::Math::Vector3(1.0f, 2.0f, 3.0f),
|
||||
VolumeIndexBounds{ 1, 2, 3, 4, 5, 6 },
|
||||
3u,
|
||||
4u));
|
||||
|
||||
EXPECT_TRUE(volumeField.IsValid());
|
||||
EXPECT_TRUE(payload.Empty());
|
||||
EXPECT_EQ(volumeField.GetPayloadSize(), 4u);
|
||||
EXPECT_EQ(static_cast<const unsigned char*>(volumeField.GetPayloadData())[0], 9u);
|
||||
EXPECT_EQ(volumeField.GetVoxelSize(), XCEngine::Math::Vector3(1.0f, 2.0f, 3.0f));
|
||||
EXPECT_EQ(volumeField.GetIndexBounds(), (VolumeIndexBounds{ 1, 2, 3, 4, 5, 6 }));
|
||||
EXPECT_EQ(volumeField.GetGridType(), 3u);
|
||||
EXPECT_EQ(volumeField.GetGridClass(), 4u);
|
||||
}
|
||||
|
||||
} // namespace
|
||||
|
||||