UntoldEngine profiling has two layers that are meant to be used together:
- Structured metrics (stable numbers for trends and regressions)
- Category logs (on-demand narrative traces for deep debugging)
Use structured metrics as the source of truth, then enable category logs only when you need extra context.
Enable the profiler at runtime:
enableEngineMetrics = trueOr via environment variable:
export UNTOLD_METRICS=1
./YourAppEnable periodic frame stats logging:
setEngineStatsLogging(
enabled: true,
profile: .compact, // or .verbose
intervalSeconds: 1.0
)With profile: .verbose, the engine logs 8 lines every interval:
Frame 1234 | CPU 12.34ms (81.0 fps, smoothed) GPU 8.45ms exec / 90.0 fps cadence [GPU-bound]
Timing: frame 12.34ms (raw CPU) | update 1.23ms | render 8.45ms | cull 0.45ms | stream 2.34ms | batchTick 0.12ms | batchRebuild 0.00ms
Render: draws 45 (opaque 32, transparent 3, shadow 8, batched 28) | triangles 125000 | visible 89
Culling: frustum 234/512 failed 278 | occlusion 198/234 failed 36 | usedHZB true validHZB true
Streaming: loaded 847 loading 3 unloaded 12 | active 3 | nearby 124 candidates 5 slots 4 | backlog 0 | pendingUploads 3 | gateMs 0.00
Streaming: tick=true workMs 1.23 | evictions 0 | avgLoadMs 45.67 | applyMs 0.89 | tileSwapWarn 0
Batching: groups 132 | batchedMeshes 916 | dirty 0→0 | defWork 0 skipComplex 2 | dispatched 0→0 groups | rebuilds/s 0 | rebuildMs 0.00
Memory: mesh 312/512mb | tex 198/512mb | total 50% | entities 847
Streaming line 1 — entity counts and slot pressure:
| Field | What it tells you |
|---|---|
loaded / loading / unloaded |
Entity residency state across the full scene. |
active |
Concurrent async loads in flight. |
nearby … candidates … slots |
Entities in range → eligible this tick → slots free. A gap between candidates and slots means the concurrency limit is the bottleneck. |
backlog |
Candidates that couldn't start because all slots were taken. Persistent nonzero = slot-starved. |
pendingUploads |
Meshes waiting on the GPU upload gate. |
gateMs |
Time the frame spent blocked waiting for the upload gate. |
Streaming line 2 — per-tick operational detail:
| Field | What it tells you |
|---|---|
tick=true/false |
Whether the streaming update ran this frame. false = throttle interval hasn't elapsed. |
workMs |
CPU time inside the streaming tick. Spikes here cause frame hitches. |
evictions |
Mesh evictions triggered by memory pressure this tick. Frequent nonzero = budget too tight. |
avgLoadMs |
Mean I/O + parse time per mesh load. High values = I/O bound, not slot-starved. |
applyMs |
Main-thread GPU upload cost when a completed load is applied. |
tileSwapWarn |
Cumulative tile representation thrash events (≥ 6 swaps in 5 s). |
Batching line — rebuild scheduler state:
| Field | What it tells you |
|---|---|
dirty X→X |
Dirty cells before vs after the work-budget prune. A large reduction means the scheduler is throttling rebuilds. |
defWork |
Cells deferred because they exceeded the per-tick CPU budget. Persistent nonzero = rebuild falling behind. |
skipComplex |
Cells permanently skipped by the complexity guard. These will never batch until the budget is raised. |
dispatched→groups |
Cells rebuilt this tick → batch groups produced. |
rebuildMs |
Total CPU time spent on rebuild work this tick. |
Memory line:
| Field | What it tells you |
|---|---|
mesh X/Ymb |
Geometry memory used vs geometry budget. |
tex X/Ymb |
Texture memory used vs texture budget. |
total X% |
Combined utilization across both pools. |
PRESSURE |
Appears when either pool hits ≥ 85 % utilization. |
Read profiler snapshots programmatically:
let metrics = EngineProfiler.shared.snapshot()
print("CPU mean: \(metrics.cpuFrame.meanMs) ms")
print("GPU mean: \(metrics.gpuCommandBuffer.meanMs) ms")
let frameStats = getEngineStatsSnapshot()
print("Frame: \(frameStats.frameIndex)")
print("Update: \(frameStats.timing.updateMs) ms")
print("Render: \(frameStats.timing.renderTotalMs) ms")High-volume instrumentation categories are disabled by default:
OOCTimingOOCStatusAssetLoader
Enable them when diagnosing OOC/loader behavior:
// Keep structured profiler metrics on
enableEngineMetrics = true
setEngineStatsLogging(enabled: true, profile: .compact, intervalSeconds: 1.0)
// Add focused trace logs
Logger.enable(category: .oocStatus) // OutOfCore lifecycle/status
Logger.enable(category: .oocTiming) // OOC timing detail
Logger.enable(category: .assetLoader) // progressive loader parse/uploadDisable after capture:
Logger.disable(category: .oocTiming)
Logger.disable(category: .oocStatus)
Logger.disable(category: .assetLoader)When FPS is lower than expected and the draws count in the engine stats is high relative to visible entities, the static batching system may not be producing as many groups as expected.
Enable the .batching log category to get a material-diversity report:
// One-shot snapshot at any point (e.g. after the scene finishes loading)
Logger.enable(category: .batching)
BatchingSystem.shared.logMaterialDiagnosticsNow()
Logger.disable(category: .batching)Or arm it to fire automatically every 30 seconds during a session:
Logger.enable(category: .batching)
// engine loop calls logMaterialDiagnosticsIfDue() each frame — no extra code needed[BatchMaterial] staticBatch=916 registered=916 resolved=916 batchable=87%
| singletons=119 groupable=797 | cellsBlocked=2
| uniqueMatLOD=80 singletonKeys=119 groupableKeys=132
[BatchMaterial] cell(0,-1,0) ents=224 uniqueKeys=49 singletons=22 groupable=27 groups=0 ratio=0.45
[BatchMaterial] cell(-1,-1,0) ents=238 uniqueKeys=42 singletons=16 groupable=26 groups=0 ratio=0.38
[BatchMaterial] cell(-2,-1,0) ents=49 uniqueKeys=24 singletons=12 groupable=12 groups=12 ratio=0.50
Scene-level fields:
| Field | What it tells you |
|---|---|
staticBatch |
How many entities have StaticBatchComponent. If this is 0 the asset was not set up for batching. |
registered / resolved |
Should match staticBatch once all tiles are resident. A large gap means entities are failing eligibility checks (animation, transparency). |
batchable |
Percentage of resolved entities that share a material key with a peer. Above 80% is healthy. |
cellsBlocked |
Cells rejected by the runtime complexity guard — their entities are rendered individually regardless of material sharing. |
uniqueMatLOD |
Distinct (material × LOD) keys globally. A small number (< 200) with high batchable is ideal. |
Per-cell fields:
| Field | What it tells you |
|---|---|
ents |
Entities registered in the cell. Very high counts (> 150) risk hitting the complexity guard. |
uniqueKeys |
Distinct material keys in the cell. Many keys spread across few entities = high diversity. |
groups |
Batch groups actually built. 0 with non-zero groupable means the cell has not been built yet or was blocked. |
ratio |
singletonKeys / uniqueKeys. Close to 1.0 means nearly every material is unique to one entity — nothing will batch. |
groups=0 on large cells, cellsBlocked > 0
The complexity guard is rejecting cells with too many vertices. Reduce the batch cell size (default 32 world units) so fewer entities land per cell, or raise maxRuntimeCellVertices / maxRuntimeCellBufferBytes in the platform tuning profile.
High uniqueMatLOD count, ratio near 1.0 per cell
Material diversity in the asset: each mesh instance uses a slightly different material, preventing grouping. The fix is asset-side — consolidate materials into shared PBR parameter sets or texture atlases.
batchable=0%, staticBatch=0
The scene entities were not tagged with StaticBatchComponent during export or scene setup. No batching will occur until the component is present.
resolved much lower than registered
Entities are failing resolveBatchCandidate. Common causes: transparent submeshes, skeleton/animation components, or preserveIdentity scene channels. Check the entity setup.
The key per-tick streaming fields (updateTriggered, updateWorkMs, nearbyEntitiesQueried, availableLoadSlots, evictionsPerformed, averageAsyncLoadMs, lastApplyLoadedMeshMs, tileSwapWarnings) are automatically included in the verbose stats output — see Verbose output format above. No extra setup is needed for routine streaming triage.
When you need to read streaming state in code rather than from the log, pull the diagnostics struct directly:
let diag = GeometryStreamingSystem.shared.getDiagnosticsSnapshot()
print("update triggered: \(diag.updateTriggered) workMs: \(diag.updateWorkMs)")
print("nearby queried: \(diag.nearbyEntitiesQueried) candidates: \(diag.loadCandidates)")
print("slots available: \(diag.availableLoadSlots) started: \(diag.startedLoads)")
print("evictions: \(diag.evictionsPerformed) tileSwapWarnings: \(diag.tileSwapWarnings)")
print("avg async load: \(diag.averageAsyncLoadMs) ms")This struct also contains fields not in the verbose snapshot: startedLoads, activeLoadsAtUpdateStart/End, lastAsyncLoadMs, lastAsyncReloadLODMs, lastFailedAsyncLoadMs, and lastUnloadMeshMs.
| Field | What it tells you |
|---|---|
updateTriggered |
Whether the streaming tick ran this frame. false means the throttle interval hasn't elapsed yet. |
updateWorkMs |
CPU time spent inside the streaming update. Spikes here cause frame hitches. |
nearbyEntitiesQueried |
How many entities the frustum/radius gate evaluated. |
loadCandidates / startedLoads |
How many entities were eligible vs actually kicked off. A gap means slots were full. |
availableLoadSlots |
Concurrency slots free at the start of the tick. 0 = slot-starved. |
evictionTriggered / evictionsPerformed |
Whether memory pressure forced an eviction pass. Frequent evictions indicate the budget is too tight for the scene. |
lastAsyncLoadMs / averageAsyncLoadMs |
I/O + parse time per mesh. High values mean the bottleneck is I/O, not slot count. |
lastApplyLoadedMeshMs |
Main-thread GPU upload cost when a mesh completes loading. |
tileSwapWarnings |
Count of tile representation thrash events (≥ 6 swaps in 5 s). Nonzero means a tile is oscillating between LOD levels. |
For a one-shot console dump of streaming counts, cache state, and memory budget together:
GeometryStreamingSystem.shared.printStats()Output includes: loaded / loading / unloaded entity counts, active load slot usage, cached file count, total cache memory, and mesh budget utilization.
Enable the .tileStreaming category for event-level traces (tile parse timeouts, eviction warnings, swap-thrash alerts):
Logger.enable(category: .tileStreaming)
// ... reproduce the issue ...
Logger.disable(category: .tileStreaming)Memory usage (mesh mb, texture mb, combined utilization, entity count, and pressure flag) is included in the verbose stats output automatically — see the Memory line in Verbose output format above.
For more detail than the one-line summary, MemoryBudgetManager provides two additional paths.
logStatus() fires automatically whenever memory crosses the high-water or low-water mark. No setup required — watch the log for:
MemoryBudgetManager Status:
- Mesh Memory: 312 MB / 512 MB (60.9%)
- Texture Memory: 198 MB / 512 MB (38.7%)
- Total GPU Memory: 510 MB / 1024 MB (49.8%)
- Tracked Entities: 847
- Under Pressure: false
This fires at both the high-water and low-water thresholds, so you get one log when pressure starts and another when it clears.
Call at any point to log the current state regardless of pressure:
MemoryBudgetManager.shared.logStatus()let stats = MemoryBudgetManager.shared.getStats()
print("mesh: \(stats.meshMemoryUsed) / \(stats.geometryBudget) util: \(stats.geometryUtilization)")
print("texture: \(stats.textureMemoryUsed) / \(stats.textureBudget) util: \(stats.textureUtilization)")
print("pressure: \(stats.isUnderPressure)")The key rebuild-scheduler fields (dirty X→X, defWork, skipComplex, dispatched→groups, rebuildMs) are included in the verbose stats output automatically — see the Batching line in Verbose output format above.
getTickDiagnosticsSnapshot() exposes the complete per-tick struct, including fields not in the verbose output (deferredByQuiescence, deferredByVisibility, appliedArtifacts, inFlightBuildCells, maxCellRebuildMs, rebuiltVertices, rebuiltBufferBytes):
let diag = BatchingSystem.shared.getTickDiagnosticsSnapshot()
print("dirty cells: \(diag.dirtyCellsBeforePrune) → \(diag.dirtyCellsAfterPrune) after prune")
print("dispatched builds: \(diag.dispatchedBuilds) applied: \(diag.appliedArtifacts)")
print("deferred by quiescence: \(diag.deferredByQuiescence)")
print("deferred by work budget: \(diag.deferredByWorkBudget)")
print("rebuild work: \(diag.rebuildWorkMs) ms max cell: \(diag.maxCellRebuildMs) ms")| Field | What it tells you |
|---|---|
dirtyCellsBeforePrune / AfterPrune |
How many cells needed rebuild vs how many survived the work-budget prune. A large prune gap means the scheduler is throttling. |
deferredByQuiescence |
Cells skipped because entities were still arriving. Normal during initial scene load. |
deferredByVisibility |
Cells skipped because they were outside the camera frustum. |
deferredByWorkBudget |
Cells skipped to stay within the per-tick CPU budget. Persistent nonzero values mean rebuild is falling behind. |
skippedByComplexityGuard |
Cells with too many vertices for the runtime budget — they will never batch until the budget is raised or cell size is reduced. |
rebuildWorkMs / maxCellRebuildMs |
Total and worst-case rebuild time for this tick. |
rebuiltVertices / rebuiltBufferBytes |
Output size of the rebuild work — useful for estimating GPU buffer pressure. |
For a one-line summary suitable for frame logging:
print(BatchingSystem.shared.diagnosticSummary())
// batch: registered=916 dirty=3 rebuildMs=0.4 groups=132The following helpers use print() rather than the Logger and are intended for quick local inspection. They are not gated by log categories and have no throttle.
| Call | What it prints | Platform |
|---|---|---|
GeometryStreamingSystem.shared.printStats() |
Streaming counts + cache stats + memory budget in one block | All |
RealSurfacePlaneStore.shared.logAllPlanes() |
All ARKit-detected planes: alignment, classification, Y height, extent | AR only |
These are best used with a breakpoint or a temporary onUpdate call. Do not leave them in shipped code.
When metrics are enabled, the engine emits signpost scopes:
FrameUpdateRenderPrepEncodeSubmit
To inspect timeline data:
- Open Instruments
- Choose Points of Interest
- Filter subsystem to
com.untoldengine.profiling - Run the app with
enableEngineMetrics = true(orUNTOLD_METRICS=1)
EngineProfileris available in all configs, but disabled by default until enabled at runtime.EngineStatscollection is compiled in debug by default (ENGINE_STATS_ENABLED).- For release profiling with
EngineStats, build with-DENGINE_STATS_ENABLED.
If ENGINE_STATS_ENABLED is not compiled in, getEngineStatsSnapshot() returns default values and setEngineStatsLogging(...) is effectively a no-op.
- Category filtering applies to
Logger.log(...)debug/info trace paths. - Warnings and errors still emit regardless of category state.
- Logger messages are lazily evaluated, so disabled categories avoid message-building cost.
#if DEBUG
let metricsLogger = MetricsDebugLogger()
metricsLogger.logIfNeeded() // throttle-prints approximately once per second
#endifProfiler hooks are already integrated into:
UntoldEngine.swift(runFrame)RenderingSystem.swift(UpdateRenderingSystem)UntoldEngineXR.swift(executeXRSystemPass)UntoldEngineAR.swift(draw)BatchingSystem.swift(logMaterialDiagnosticsIfDue— fires automatically every 30 s when the.batchingcategory is enabled)