article · 2026-01-23

Why AI Coding Agents Hallucinate in Unreal — and How a Glass-Box Bridge Fixes It

Stateless agents guess at the editor's state and confidently get it wrong. Here's how grounding every response in live world state stops the hallucination.

Mythic Dev Assist (featured on Fab, $24.99): a queryable causal world-model of UE5 for AI coding agents, over MCP.
101 declarative HTTP action routes
14 generic MCP tools wrapping the bridge
18 always-on observability channels
Supported engine versions: 5.3-5.7+

Why your AI agent hallucinates in the Unreal editor

You ask Claude Code or Cursor to spawn the actor it just wrote, nudge it onto the navmesh, and screenshot it. It replies that everything worked perfectly — except nothing spawned, the editor wasn't even in Play, and the screenshot path points at a file that was never written. The agent didn't lie to you on purpose. It hallucinated because it is fundamentally blind: an AI coding agent is a stateless text process, and the Unreal editor is a live, stateful program it cannot see into.

The failure mode is always the same. The agent builds a mental model of your project from the code it read minutes ago, then keeps acting on that stale model long after the editor has moved on. It assumes a map is loaded, assumes an actor exists, assumes a property took the value it sent. When an AI agent hallucinates in the Unreal editor, it is filling the gap between what it last knew and what is true right now with its best statistical guess, and a guess about live engine state goes stale the moment the editor changes.

Grounding an AI in game-engine state means closing that gap on every single turn: making the editor tell the agent what is actually true rather than letting the agent infer it. That is exactly the design thesis behind MythicDevAssist (MDA), an agent-native editor bridge for Unreal Engine 5. MDA runs inside the editor process as an Engine Subsystem and exposes an in-editor HTTP server bound to loopback only (default 127.0.0.1:7779), plus a companion Python MCP server. It is the bridge layer, not the AI — you bring your own agent — and its whole job is to keep that agent honest.

World Pulse: every response carries live world state

The single most effective antidote to a stale mental model is to refuse to let it go stale. MDA does this with World Pulse: every response the bridge returns carries a _meta block describing the live world at the moment of the call. You don't have to ask 'is the game running?' as a separate step and hope the answer is still valid by the next call — the answer rides along with the result of whatever you just did.

Concretely, World Pulse reports the context (pie, editor, or headless), the current frame, whether Play-In-Editor is running, the live actor count, the loaded map, the timescale, the paused flag, and the game time. So when the agent moves an actor and the response comes back with pie_running false and the map it didn't expect, the agent immediately knows its assumption was wrong — instead of narrating a success that never happened.

Because the pulse is attached to ordinary action responses rather than gated behind a polling tool, the agent is continually re-grounded as a side effect of doing work. There is no moment where it is operating on yesterday's picture of the world. That continuous correction is the difference between an agent that drifts into confident fiction and one that notices the moment reality diverges from its plan.
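As a rough illustration of how an agent host might use the pulse, the sketch below compares a response's _meta block against what the plan assumed. The key names follow the fields listed above; the `paused` key, the overall JSON layout, and the sample values are assumptions made for illustration, not a documented MDA schema.

```python
from typing import Any


def check_pulse(response: dict[str, Any], expected_map: str, need_pie: bool) -> list[str]:
    """Return the mismatches between the agent's plan and the live _meta block."""
    meta = response.get("_meta")
    if meta is None:
        return ["response has no _meta block"]
    problems = []
    if need_pie and not meta.get("pie_running", False):
        problems.append("PIE is not running")
    if meta.get("map") != expected_map:
        problems.append(
            f"loaded map is {meta.get('map')!r}, expected {expected_map!r}"
        )
    return problems


# Hypothetical response to a move action; every value here is made up.
sample = {
    "result": {"moved": "BP_Enemy_C_0"},
    "_meta": {
        "context": "editor", "frame": 1234, "pie_running": False,
        "actor_count": 57, "map": "/Game/Maps/MainMenu",
        "timescale": 1.0, "paused": False, "game_time": 0.0,
    },
}
problems = check_pulse(sample, expected_map="/Game/Maps/Arena", need_pie=True)
for problem in problems:
    print("plan mismatch:", problem)
```

A non-empty list is the cue to stop and re-plan rather than report success.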

Rich mutation responses beat '{success: true}'

Most editor automation returns a bare acknowledgement: you ask it to do a thing, it says it did the thing. That is precisely the soil hallucination grows in, because '{success: true}' gives the agent nothing to check its own work against — so it invents the details it wishes it had. MDA instead makes every mutation read back what actually changed.

Destroy an actor and the response names the actors that were removed, along with their classes and world locations — so the agent can confirm it deleted the right thing rather than assuming it did. Set a property and you get the read-back values straight from the object plus a per-property breakdown of any failures, which means a property that silently refused to take is reported as a failure instead of being papered over. Move an actor and the response returns the post-collision-adjusted position, so the agent learns where the actor truly ended up rather than where it asked it to go.
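To make the read-back idea concrete, here is a minimal sketch of the comparison an agent host could run after a set-property call. It assumes the read-back values arrive as a flat name-to-value mapping; that shape and the property names are hypothetical, and the real response layout may differ.

```python
import math


def diff_readback(requested: dict, readback: dict, tol: float = 1e-4) -> dict:
    """Map each property whose read-back value differs from the requested one."""
    mismatches = {}
    for name, want in requested.items():
        got = readback.get(name)  # None when the property was not read back
        both_numbers = (
            isinstance(want, (int, float)) and isinstance(got, (int, float))
            and not isinstance(want, bool) and not isinstance(got, bool)
        )
        # Floats are compared with a tolerance; everything else must match exactly.
        same = math.isclose(want, got, abs_tol=tol) if both_numbers else want == got
        if not same:
            mismatches[name] = {"requested": want, "actual": got}
    return mismatches


# Hypothetical request and read-back for one actor.
requested = {"MaxHealth": 150.0, "bCanBeDamaged": False}
readback = {"MaxHealth": 100.0, "bCanBeDamaged": False}
print(diff_readback(requested, readback))
```

The same comparison works for a move: request one location, then diff it against the post-collision position that came back.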

This read-back-everything discipline is what lets a stateless agent self-correct. When the value it sent and the value it gets back disagree, that contradiction is sitting right there in the response for the model to reason about. For destructive work there is also a dry-run path: previewing a console command such as 'DestroyAll BP_Enemy_C' reports a would_affect count before anything is committed, so the agent can check the blast radius rather than discovering it afterwards.
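A dry-run count is only useful if something acts on it. The sketch below shows one way a host could gate a destructive command on the would_affect value. Only the would_affect field name comes from the text above; the surrounding response shape and the threshold are assumptions.

```python
def should_commit(preview: dict, max_affected: int) -> bool:
    """Allow a destructive command only when the previewed blast radius is acceptable."""
    count = preview.get("would_affect")
    # A missing or non-integer count is treated as "unknown", so do not commit.
    if not isinstance(count, int) or isinstance(count, bool):
        return False
    # Zero means the command would do nothing, so there is nothing to commit.
    return 0 < count <= max_affected


# Hypothetical dry-run previews for 'DestroyAll BP_Enemy_C'.
print(should_commit({"would_affect": 4}, max_affected=10))
print(should_commit({"would_affect": 250}, max_affected=10))
```

Failing closed on a missing count keeps the agent from treating an unreadable preview as permission.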

Project Twin: stop guessing the engine version and plugins

A huge share of agent mistakes trace back to assumptions about the project itself: which engine version it targets, whether it's a source build, which plugins are even enabled. An agent that assumes Niagara is available, or that an API exists in your engine version when it doesn't, will write code that cannot possibly work and then explain at length why it should.

MDA closes this with Project Twin, exposed as the mda://project-twin resource and attached on the first call so the agent is oriented before it acts. It reports the engine version, the source-build flag, the build configuration, the enabled plugins, and a capabilities list (python, niagara, gameplay_abilities, ai_module, navigation), plus the default map and system info. The agent no longer guesses whether it can spawn a Niagara system or call into the Gameplay Abilities System — it reads the capabilities and knows.
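In practice, reading the capabilities and knowing can be as simple as a pre-flight check before the agent writes any code. The sketch below assumes the twin is a JSON object with `engine_version` (a dotted string) and `capabilities` (a list of names); those key names and the sample values are illustrative guesses, not the documented resource format.

```python
def unmet_requirements(twin: dict, needs: set[str], min_engine: tuple[int, int]) -> list[str]:
    """List the reasons a planned task cannot work in this project."""
    problems = []
    # Compare only major.minor, e.g. "5.4.4" -> (5, 4).
    major, minor = (int(part) for part in twin["engine_version"].split(".")[:2])
    if (major, minor) < min_engine:
        problems.append(
            f"engine {major}.{minor} is older than required "
            f"{min_engine[0]}.{min_engine[1]}"
        )
    have = set(twin.get("capabilities", []))
    for cap in sorted(needs - have):
        problems.append(f"capability '{cap}' is not available")
    return problems


# Hypothetical Project Twin payload.
twin = {
    "engine_version": "5.4.4",
    "source_build": False,
    "capabilities": ["python", "navigation", "ai_module"],
}
print(unmet_requirements(twin, needs={"niagara", "navigation"}, min_engine=(5, 5)))
```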

This pairs with structured discovery for the project's code and content. The repo_map tool returns a ranked source map that weights UCLASS, USTRUCT, UFUNCTION and UPROPERTY declarations, so an agent can orient on an unfamiliar codebase before it edits anything. The mda_world find action locates actors by class, name, tag or component with class-hierarchy matching, and mda_asset search queries the Asset Registry and returns parent-chain and dependency counts. Grounding starts before the first mutation, not after the first mistake.

Semantic hints and pattern-based warnings

Even a well-grounded agent will occasionally walk into the same wall twice. MDA treats repeated failure as a signal rather than noise. Every error response carries a semantic agent_hint with a specific recovery path — not a generic 'something went wrong', but a pointed suggestion about what to try next — so the model has a concrete next move instead of guessing.

On top of that, the bridge watches for patterns: after three or more failures of the same action, it proactively attaches recovery guidance to steer the agent off the path it keeps failing on. This is the structural fix for the classic agent doom-loop, where a model retries an identical broken call a dozen times because nothing in the response ever told it to change approach.
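The pattern described above is easy to picture as a small counter keyed by action name. The sketch below is not MDA's implementation, only an illustration of the idea; the class name, the reset-on-success behaviour, and the wording of the guidance are all assumptions.

```python
from collections import Counter
from typing import Optional


class FailureTracker:
    """Count failures per action and emit guidance once a threshold is reached."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.failures: Counter = Counter()

    def record(self, action: str, ok: bool) -> Optional[str]:
        if ok:
            # Assumption: a success clears the streak for that action.
            self.failures.pop(action, None)
            return None
        self.failures[action] += 1
        count = self.failures[action]
        if count >= self.threshold:
            return (
                f"'{action}' has failed {count} times; "
                "change approach instead of retrying the same call"
            )
        return None


tracker = FailureTracker()
hints = [tracker.record("spawn_actor", ok=False) for _ in range(3)]
print(hints[-1])
```

The first two failures return nothing; the third carries the guidance, which is what breaks the retry loop.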

The same grounding philosophy extends into debugging through a deeper observability layer. MDA streams 18 always-on channels (rendering, memory, gc_events, compilation_events, physics, niagara, ai, gameplay and more) into a per-session SQLite database under your project's Saved/Observations folder, and the agent can run SQL against the unified session history. That turns vague questions into checkable ones: a jittering NPC becomes a query for snapshots WHERE jerk > 10000, and an actor that fell through the world becomes a navmesh-status filter. The agent stops speculating about why something looks wrong and starts measuring it.
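To show what such a query could look like, the sketch below runs the jerk filter against an in-memory SQLite database that stands in for the session file. The `snapshots` table and `jerk` column come from the example above; the other columns, the sample rows, and the schema as a whole are invented for illustration.

```python
import sqlite3

# In-memory stand-in for the per-session database under Saved/Observations.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE snapshots (game_time REAL, actor TEXT, jerk REAL)")
conn.executemany(
    "INSERT INTO snapshots VALUES (?, ?, ?)",
    [
        (1.00, "BP_Guard_C_0", 120.0),
        (1.02, "BP_Guard_C_0", 18500.0),
        (1.04, "BP_Guard_C_1", 90.0),
        (1.06, "BP_Guard_C_0", 22100.0),
    ],
)

# Which actors spiked past the jerk threshold, and how badly?
rows = conn.execute(
    "SELECT actor, COUNT(*) AS spikes, MAX(jerk) AS worst "
    "FROM snapshots WHERE jerk > ? GROUP BY actor ORDER BY spikes DESC",
    (10000,),
).fetchall()
for actor, spikes, worst in rows:
    print(f"{actor}: {spikes} spikes, worst jerk {worst}")
```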

Getting grounded: setting it up

Installation follows the normal plugin path. Enable MythicDevAssist under Edit > Plugins > Developer Tools and restart the editor; its engine dependencies (SQLiteCore, Niagara, GameplayAbilities) auto-enable. A dockable MDA dashboard then shows the live HTTP endpoint, the session database path, and a per-call action log with a full request and response viewer, so you can watch exactly what the agent is seeing.

For the AI side, install Python 3.10 or newer on PATH and run 'pip install mcp'; that is the MCP server's only Python dependency. Register the server with your agent: Claude Code via 'claude mcp add mda -- python <path>/mda_mcp_server.py', Cursor via its mcp.json, and Codex via its config. Each one points the same python command at the same script path. Custom agent hosts can skip the Python layer entirely and speak MCP directly to the in-editor server over POST /mcp.
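For Cursor, the registration boils down to one entry in mcp.json. The sketch below generates that entry using Cursor's usual mcpServers layout; the server name mda matches the command above, the script path is left as the same placeholder, and the helper function itself is hypothetical.

```python
import json


def cursor_mcp_config(server_script: str, python_cmd: str = "python") -> str:
    """Build the mcp.json text that registers the MDA MCP server as 'mda'."""
    config = {
        "mcpServers": {
            "mda": {
                "command": python_cmd,
                "args": [server_script],
            }
        }
    }
    return json.dumps(config, indent=2)


# Replace the placeholder with the real location of the script on your machine.
print(cursor_mcp_config("<path>/mda_mcp_server.py"))
```

Paste the printed JSON into the project's or user's mcp.json, merging it with any servers already listed there.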

Verify the connection by asking your agent to list MDA bridges; the response should show the running editor's PID, port, and project name. Because the runtime module ships as UncookedOnly, none of this lands in a cooked game, so there is zero runtime cost in your packaged build. From there, the next useful step is to let the agent run a single real task end to end: spawn an actor it wrote, read back the World Pulse, screenshot it, and watch it correct itself the first time reality disagrees with its plan.

Stateless guessing vs glass-box grounding

| Failure mode | What a blind agent does | How MDA grounds it |
| --- | --- | --- |
| Stale world picture | Assumes a map is loaded and PIE is running | World Pulse _meta carries context, frame, pie_running, actor_count, map, timescale, game_time on every response |
| Unverified mutations | Trusts a bare success acknowledgement | destroy returns removed names/classes/locations; set_property returns read-back values + per-property failures; move returns adjusted position |
| Wrong project assumptions | Guesses the engine version and which plugins exist | Project Twin reports engine version, source_build, enabled plugins and a capabilities list, attached on first call |
| Repeated identical failures | Retries the same broken call in a loop | Semantic agent_hint on every error, plus auto-attached recovery guidance after 3+ same-action failures |

How MDA replaces an agent's assumptions with facts read back from the live editor.

FAQ

Why does my AI agent hallucinate in the Unreal editor?

Because it is a stateless text process acting on a stale mental model of a live, stateful program. The agent builds its picture of the project from code it read earlier, then keeps acting on that picture after the editor has moved on — filling the gap between what it last knew and what is true now with a guess. MDA closes that gap by attaching live world state to every response.

What does it actually mean to ground an AI in game-engine state?

It means making the editor tell the agent what is true rather than letting the agent infer it. In MDA that is World Pulse (live _meta on every response), rich mutation read-backs (what actually changed, not just '{success: true}'), and Project Twin (engine version, enabled plugins and capabilities). The agent reasons over facts read back from the editor instead of its own assumptions.

Is MythicDevAssist an AI, or does it replace my agent?

Neither — it is the bridge layer, not an AI. You bring your own agent (Claude Code, Cursor, Codex CLI, or any MCP client). MDA runs inside the UE5 editor as an Engine Subsystem and exposes an in-editor HTTP server plus a Python MCP server so that agent can drive and observe the editor instead of guessing.

Will it add overhead to my shipped game?

No. The runtime module ships as UncookedOnly, which means it is excluded from cooked shipping builds. The bridge exists only in the editor, so there is zero runtime cost in your packaged game.

Which engine versions and platforms does it support?

The plugin is packaged for Unreal Engine 5.3, 5.4, 5.5, 5.6 and 5.7+. The editor plugin itself runs on Windows 64-bit only; the optional standalone MCP server is described as usable on other platforms.

Get it on Fab

Mythic Dev Assist

Give AI coding agents (Claude Code, Cursor, any MCP client) eyes inside Unreal — a queryable causal world model exposing perception, memory, causality, verification and action through an in-editor HTTP bridge and an external MCP server. Observe, set, create, destroy and watch the editor programmatically.

$24.99 USD · one-time · free updates