tutorial · 2026-01-14
Load Voice Audio On Demand With TSoftObjectPtr in UE5
Keep hundreds of SoundWave clips off the heap until the moment a line actually plays.
Why hundreds of voice lines shouldn't all load at once
A talkative NPC can ship with hundreds of recorded lines, and the naive way to wire that up is to hold a hard USoundWave reference for every clip. The trouble is that a hard reference forces the engine to load the referenced asset whenever the referencing asset loads. Reference a DataTable full of SoundWaves with raw pointers and you have just told Unreal to pull every clip into memory the instant that table comes into scope, whether the player ever triggers those lines or not.
That is exactly the problem TSoftObjectPtr is built to solve. A soft object pointer stores a path to the asset rather than the loaded asset itself, so the data sits in your table costing almost nothing until you explicitly resolve it. The Assassin Dialogue Lore Pack uses this pattern by design: its DT_Dialogue rows reference audio through a VoiceAudio column typed as TSoftObjectPtr<USoundWave>, so the clips are not loaded until first played.
The scale makes the saving concrete. The free pack carries around 570 voice lines totalling roughly 72 minutes of audio across 16 categories. If you only ever play combat barks during a fight and greetings on overlap, there is no reason for the story or weather lines to be resident in memory at all. Soft references let you load just the clip you are about to play and leave everything else on disk.
How TSoftObjectPtr defers loading
A TSoftObjectPtr is essentially a wrapper around an asset path with a weak handle to the object if it happens to be loaded. Declaring a column as TSoftObjectPtr<USoundWave> in a DataTable row, the way the pack's DT_Dialogue schema does, means the row knows where the clip lives without ever pulling it in. The fields around it stay lightweight too: the schema keeps Name, DialogueName, ResponseText, CharacterName, EmotionalTone, ContextTags and NPCType as plain strings, with VoiceAudio as the only soft asset reference.
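To make the "path plus weak handle" idea concrete, here is a minimal engine-free model in standard C++. It is a sketch of the concept only, not Unreal's implementation: SoftSoundPtr, ClipLoader and the stand-in SoundWave struct are hypothetical names, and the loader callback stands in for whatever the engine does when it reads an asset from disk.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Stand-in for USoundWave; a real clip would carry audio data.
struct SoundWave {
    std::string SourcePath;
};

// Hypothetical loader: turns an asset path into a loaded clip.
using ClipLoader = std::function<std::shared_ptr<SoundWave>(const std::string&)>;

// Engine-free model of a soft pointer: an asset path plus a weak handle.
class SoftSoundPtr {
public:
    explicit SoftSoundPtr(std::string InPath) : Path(std::move(InPath)) {}

    const std::string& GetPath() const { return Path; }

    // Non-loading lookup: returns the clip only if it is currently alive.
    std::shared_ptr<SoundWave> Get() const { return Cached.lock(); }

    // Blocking resolve: calls the loader only when the clip is not alive.
    std::shared_ptr<SoundWave> LoadSynchronous(const ClipLoader& Loader) {
        if (std::shared_ptr<SoundWave> Existing = Cached.lock()) {
            return Existing;
        }
        std::shared_ptr<SoundWave> Loaded = Loader(Path);
        Cached = Loaded;  // weak handle: does not keep the clip alive
        return Loaded;
    }

private:
    std::string Path;
    std::weak_ptr<SoundWave> Cached;
};
```

Constructing a SoftSoundPtr costs one string. Nothing calls the loader until LoadSynchronous runs, and the pointer itself never keeps the clip alive, so the caller has to hold the returned reference for as long as the line is playing.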
The typical runtime flow is to query the table by context, then resolve the chosen row's soft pointer only at the moment of playback. In Blueprint that means getting the row names from DT_Dialogue, filtering rows whose ContextTags contain the situation you want, such as social/greeting or combat, picking a random match, and only then resolving its VoiceAudio. In C++ the equivalent is to load the DataTable, call GetAllRows on your row struct, filter on the ContextTags string, and resolve the soft pointer on the winner.
Because the pack tags every line as category/subcategory/size, you can filter cheaply before you ever touch audio. Lines also come in four length tiers, SM through XL, which matters for how aggressively you defer the heavy ones. The point is that all that filtering happens on string data; the SoundWave only enters memory once you have decided you genuinely need it.
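The filter-then-pick step can be sketched in plain C++ without touching any audio. This is an illustration under assumptions: DialogueRow here is a cut-down model rather than the pack's actual row struct, VoiceAudioPath stands in for the soft pointer's path, and the tag values in the usage note are made up to match the category/subcategory/size shape.

```cpp
#include <cstddef>
#include <random>
#include <string>
#include <vector>

// Cut-down row model; only the fields the filter needs.
struct DialogueRow {
    std::string Name;
    std::string ContextTags;     // category/subcategory/size
    std::string VoiceAudioPath;  // stands in for the soft pointer's path
};

// Returns the indices of rows whose ContextTags contain the wanted tag.
std::vector<std::size_t> FilterByTag(const std::vector<DialogueRow>& Rows,
                                     const std::string& Tag) {
    std::vector<std::size_t> Matches;
    for (std::size_t i = 0; i < Rows.size(); ++i) {
        if (Rows[i].ContextTags.find(Tag) != std::string::npos) {
            Matches.push_back(i);
        }
    }
    return Matches;
}

// Picks one matching row at random, or nullptr when nothing matches.
const DialogueRow* PickRandom(const std::vector<DialogueRow>& Rows,
                              const std::vector<std::size_t>& Matches,
                              std::mt19937& Rng) {
    if (Matches.empty()) {
        return nullptr;
    }
    std::uniform_int_distribution<std::size_t> Dist(0, Matches.size() - 1);
    return &Rows[Matches[Dist(Rng)]];
}
```

Calling FilterByTag(Rows, "social/greeting") once at initialisation and keeping the index list means the playback path only has to call PickRandom and resolve a single path. A plain substring match is the simplest option; it would also match a tag that merely contains the search text, so a stricter split on "/" may be worth it if your categories overlap.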
LoadSynchronous vs async load before play
Once you have the row you want, you have two ways to turn its soft pointer into a playable USoundWave, and the choice is about timing.
1. The simplest is LoadSynchronous. Call it on the VoiceAudio soft pointer and it blocks until the clip is resident, then hands you the USoundWave to feed straight into Play Sound 2D or Play Sound 3D. This is the path the pack's own how-to wiring uses, and for a single short bark it is usually fine: the clips are small and the hitch is negligible.
2. For larger lines you want to avoid the stall. The pack's longest tier, XL, holds paragraph-length delivery, and synchronously loading one of those mid-frame can cause a visible hitch. The documented guidance is to use audio streaming for XL lines and to pre-cache common categories at level load, so the clip is already in memory by the time the player triggers it. In practice that means kicking off an asynchronous load ahead of time and only playing once it has resolved, rather than loading on the same frame you want sound.
Either way, lean on the structure to keep loads cheap. Cache your DataTable reference instead of re-resolving it, and pre-filter rows by category at initialisation so the hot path is just pick-a-row and resolve-one-pointer. The goal is one clip in flight at a time, loaded as late as you can afford but early enough that the player never notices a gap.
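The "request early, play when resolved" shape can be modelled without the engine. The sketch below is a deterministic stand-in, not Unreal code: DeferredClipLoader is a hypothetical class, Tick() simulates I/O completing on a later frame, and the stand-in SoundWave carries no audio. In a real project the rough counterpart would be an asynchronous request through the engine's streamable manager with a completion callback.

```cpp
#include <deque>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Stand-in for USoundWave.
struct SoundWave {
    std::string SourcePath;
};

class DeferredClipLoader {
public:
    using OnLoaded = std::function<void(const std::shared_ptr<SoundWave>&)>;

    // Queue a load. The callback fires on a later Tick, never inside this call.
    void RequestAsyncLoad(const std::string& Path, OnLoaded Callback = nullptr) {
        Queue.push_back(Request{Path, std::move(Callback)});
    }

    // Completes at most one queued load per call (simulated I/O completion).
    void Tick() {
        if (Queue.empty()) {
            return;
        }
        Request Req = std::move(Queue.front());
        Queue.pop_front();
        std::shared_ptr<SoundWave> Clip = Find(Req.Path);
        if (!Clip) {
            Clip = std::make_shared<SoundWave>(SoundWave{Req.Path});
            Resident[Req.Path] = Clip;  // strong reference keeps it cached
        }
        if (Req.Callback) {
            Req.Callback(Clip);
        }
    }

    // Non-blocking lookup: nullptr until the load has completed.
    std::shared_ptr<SoundWave> Find(const std::string& Path) const {
        auto It = Resident.find(Path);
        if (It == Resident.end()) {
            return nullptr;
        }
        return It->second;
    }

private:
    struct Request {
        std::string Path;
        OnLoaded Callback;
    };
    std::deque<Request> Queue;
    std::unordered_map<std::string, std::shared_ptr<SoundWave>> Resident;
};
```

Pre-caching a category at level load is then a loop of RequestAsyncLoad calls with no callback, and playback checks Find first. For a long line you request with a callback and start the sound inside it, so the frame that asked for the line never waits on disk.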
Cook-time audio compression
Soft references control what is in memory at runtime; cooking controls what ends up in your packaged build. The pack ships its clips as USoundWave PCM assets, which are uncompressed and large on disk in the editor. You do not ship them that way.
When you cook, Unreal compresses audio for the target platform using the codec chosen by the asset and project settings (Ogg Vorbis and Bink Audio are common choices), so the packaged SoundWaves are far smaller than their editor PCM masters. This happens to the assets behind your soft pointers automatically; you are not maintaining a second compressed copy. Combined with on-demand loading, you get the full picture: compressed clips on disk, and only the handful you are actively playing decompressed into memory.
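For a sense of how large the uncompressed masters can be, a back-of-envelope calculation helps. The 72-minute total comes from the pack; the sample rate, channel count and bit depth below are assumptions chosen for illustration, not figures from the pack, so substitute the real values from your own assets.

```cpp
#include <cstdint>

// Bytes of uncompressed PCM for a given duration and format.
constexpr std::uint64_t PcmBytes(std::uint64_t Seconds,
                                 std::uint64_t SampleRateHz,
                                 std::uint64_t Channels,
                                 std::uint64_t BytesPerSample) {
    return Seconds * SampleRateHz * Channels * BytesPerSample;
}

// 72 minutes is the pack's stated total running time.
constexpr std::uint64_t PackSeconds = 72 * 60;

// Assumed format: 48 kHz, mono, 16-bit (2 bytes per sample).
constexpr std::uint64_t AssumedPcmBytes = PcmBytes(PackSeconds, 48000, 1, 2);

static_assert(AssumedPcmBytes == 414720000ULL,
              "4320 s * 48000 Hz * 1 channel * 2 bytes");
```

Under those assumptions the masters come to roughly 415 MB, which is the figure cooking brings down. The actual packaged size depends on the codec and quality settings, so measure a cooked build rather than estimating a ratio.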
If you want a free, already-correctly-wired example to study, the Assassin Dialogue Lore Pack is the one this article is built around. Its DT_Dialogue rows resolve SoundWaves through TSoftObjectPtr exactly as described here, so you can migrate the Content folder, open the table, and trace the load path end to end before applying the same approach to your own dialogue system.
Hard reference vs TSoftObjectPtr for SoundWave audio
| Concern | Hard USoundWave reference | TSoftObjectPtr<USoundWave> |
|---|---|---|
| When the clip loads | When the referencing asset loads | Only when you resolve it (e.g. before play) |
| Memory cost at rest | Clip data resident for every referenced asset | Just an asset path |
| Effect of a big DataTable | All referenced clips pulled in at once | Nothing pulled until a row is played |
| Control over timing | None — implicit | LoadSynchronous or async load on demand |
Why the dialogue pack stores VoiceAudio as a soft pointer rather than a hard USoundWave reference.
FAQ
How do I make a UE5 SoundWave load on demand instead of at startup?
Reference the clip with a TSoftObjectPtr<USoundWave> rather than a hard USoundWave pointer. The soft pointer stores only the asset path, so nothing loads until you explicitly resolve it. This is exactly how the Assassin Dialogue Lore Pack's DT_Dialogue VoiceAudio column works: clips are not loaded until first played.
Should I use LoadSynchronous or an async load before playing a line?
For short barks, LoadSynchronous on the soft pointer is fine: it blocks briefly and hands you the SoundWave to play. For long lines (the pack's XL tier), prefer an asynchronous load or audio streaming and pre-cache common categories at level load, so a paragraph-length clip does not cause a hitch when it resolves mid-frame.
Does deferred loading make my packaged build smaller?
No. On-demand loading keeps clips out of memory at runtime; cooking is what shrinks the build. When you cook, Unreal compresses audio for the target platform using the codec chosen by the asset and project settings (Ogg Vorbis and Bink Audio are common choices), so packaged SoundWaves are much smaller than the editor PCM masters behind your soft pointers.
How do I pick the right line without loading every clip first?
Filter on the row's string fields before touching audio. The pack tags lines as category/subcategory/size in ContextTags, so you can get the row names, keep rows whose ContextTags contain your situation, pick one at random, and only then resolve that single row's VoiceAudio soft pointer.