Building GM-Genie: A Zero-Tool Architecture for Cinematic AI Game Masters
These articles are AI-generated summaries. Please check the original sources for full details.
How I Built GM-Genie: A Cinematic AI Game Master with Gemini Live API
Vasilis Stefanopoulos developed GM-Genie for the Gemini Live Agent Challenge as an immersive, voice-first RPG experience. The project moved to a zero-tool architecture after function calling caused WebSocket disconnects roughly 70% of the time in voice mode.
Why This Matters
In high-concurrency, bidirectional audio environments like the Gemini Live API, traditional tool-calling patterns can add latency and destabilize the connection, sometimes ending in silent WebSocket failures. Moving logic to the server, by analyzing transcripts and pre-calculating state such as deterministic dice pools, keeps the experience seamless without relying on the model to orchestrate external API calls mid-stream. This architectural shift prioritizes connection stability and narrative flow over complex multi-agent orchestration.
Key Insights
- Function calling in gemini-2.5-flash-native-audio-latest caused WebSocket disconnects approximately 70% of the time, with close codes 1000 (normal closure), 1008 (policy violation), or 1011 (internal error) (Stefanopoulos, 2026).
- Zero-tool architectures improve reliability by using server-side SceneDetectors to trigger media events based on transcript patterns instead of model-dispatched tools.
- Deterministic state management via DicePools—pre-rolling results and injecting them into system prompts—eliminates the need for real-time tool calls for RNG during sessions.
- Continuous 16 kHz audio streams outperform noise-gated streams because client-side gating creates fragmented bursts that break the Gemini API’s Voice Activity Detection (VAD).
- Server-side audio batching from 84-byte (2.6 ms) AudioWorklet chunks to 3200-byte (100 ms) batches is required for stable processing by the LiveRequestQueue.
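The chunk and batch sizes in the last point can be checked with a little arithmetic. The sketch below assumes 16-bit mono PCM (2 bytes per sample), which the article does not state outright but which is consistent with its figures; the helper names are illustrative only.

```python
SAMPLE_RATE_HZ = 16_000  # stated in the article
BYTES_PER_SAMPLE = 2     # assumption: 16-bit mono PCM


def chunk_duration_ms(num_bytes: int) -> float:
    """Duration in milliseconds of a PCM buffer of the given size."""
    return num_bytes * 1000 / (BYTES_PER_SAMPLE * SAMPLE_RATE_HZ)


def chunks_per_batch(chunk_bytes: int, batch_bytes: int) -> int:
    """Smallest number of chunks whose total size reaches batch_bytes."""
    return -(-batch_bytes // chunk_bytes)  # ceiling division


print(chunk_duration_ms(84))        # 2.625
print(chunk_duration_ms(3200))      # 100.0
print(chunks_per_batch(84, 3200))   # 39
```

Under these assumptions, roughly 39 AudioWorklet chunks are coalesced into each 100 ms batch, which cuts the number of messages sent upstream by the same factor.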
Working Examples
Pre-rolled dice pool injected into system prompt to eliminate tool calls for RNG.
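import random


class DicePool:
    """Pre-rolls dice once so the model never needs a tool call for RNG."""

    def __init__(self, seed: int | None = None):
        # A fixed seed makes the whole pool reproducible.
        rng = random.Random(seed)
        self.pool = {
            "d4": [rng.randint(1, 4) for _ in range(30)],
            "d20": [rng.randint(1, 20) for _ in range(40)],
        }
        # Per-die cursor into the pool (not used in this excerpt).
        self._idx: dict[str, int] = {k: 0 for k in self.pool}

    def prompt_block(self) -> str:
        # Render the pool as plain text for injection into the system prompt.
        lines = ["[PRE-ROLLED DICE POOL — use in order, top to bottom]"]
        for k, vals in self.pool.items():
            lines.append(f"{k}: {', '.join(str(v) for v in vals)}")
        return "\n".join(lines)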
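One consequence of seeding the pool is that a session can be replayed with identical rolls. The following is a minimal, self-contained sketch of that property, not the author's code: `roll_pool` and `build_system_prompt` are hypothetical helpers, and the prompt wording is only an illustration.

```python
import random


def roll_pool(seed: int, sides: int, count: int) -> list[int]:
    """Pre-roll `count` dice with `sides` faces from a seeded generator."""
    rng = random.Random(seed)
    return [rng.randint(1, sides) for _ in range(count)]


def build_system_prompt(base_prompt: str, seed: int) -> str:
    """Append a pre-rolled d20 block to the base prompt (hypothetical helper)."""
    d20 = roll_pool(seed, sides=20, count=5)
    block = "[PRE-ROLLED DICE POOL, use in order]\nd20: " + ", ".join(map(str, d20))
    return f"{base_prompt}\n\n{block}"


first = build_system_prompt("You are the Game Master.", seed=42)
replay = build_system_prompt("You are the Game Master.", seed=42)

# The same seed yields the same prompt, so a session can be reproduced.
assert first == replay
```

Passing `seed=None` in production and a fixed seed in tests is one plausible way to get both unpredictability for players and repeatable debugging.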
Server-side audio batching logic to stabilize Gemini Live API ingestion.
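import asyncio

from google.genai import types  # assumed import for types.Blob

MIC_BATCH_BYTES = 3200  # 100 ms of 16 kHz audio, per the article


async def _mic_sender(live_queue, mic_buffer):
    while True:
        # Block until at least one chunk arrives.
        chunk = await mic_buffer.get()
        batch = chunk
        # Drain whatever is already queued, up to the batch size.
        while len(batch) < MIC_BATCH_BYTES:
            try:
                batch += mic_buffer.get_nowait()
            except asyncio.QueueEmpty:
                break
        live_queue.send_realtime(
            types.Blob(data=batch, mime_type="audio/pcm;rate=16000")
        )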
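To see how this greedy drain behaves, the batching step can be isolated and run against a plain `asyncio.Queue` with no Gemini dependency. This is a sketch under assumptions: `next_batch` and `demo` are hypothetical names, and the 84-byte silent chunks stand in for real AudioWorklet output.

```python
import asyncio

MIC_BATCH_BYTES = 3200  # 100 ms at 16 kHz, assuming 16-bit mono PCM


async def next_batch(mic_buffer: asyncio.Queue) -> bytes:
    """Wait for one chunk, then drain queued chunks up to the batch size."""
    batch = await mic_buffer.get()
    while len(batch) < MIC_BATCH_BYTES:
        try:
            batch += mic_buffer.get_nowait()
        except asyncio.QueueEmpty:
            break  # send what we have rather than wait for more audio
    return batch


async def demo() -> list[int]:
    mic_buffer: asyncio.Queue = asyncio.Queue()
    for _ in range(50):
        mic_buffer.put_nowait(b"\x00" * 84)  # 50 simulated worklet chunks
    sizes = []
    while not mic_buffer.empty():
        sizes.append(len(await next_batch(mic_buffer)))
    return sizes


batch_sizes = asyncio.run(demo())
print(batch_sizes)  # [3276, 924]
```

Two details are visible in the output: a batch can overshoot 3200 bytes by up to one chunk, and a batch is sent short whenever the queue runs dry, so the loop never stalls waiting to fill a batch.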
Practical Applications
- Use Case: GM-Genie uses a ‘Story Loom’ to generate campaign arcs using d12 tables to ensure narrative purpose. Pitfall: Using procedural generation without a structured arc results in generic, directionless stories.
- Use Case: Server-side SceneDetector monitors transcripts for visual cues like ‘you see’ to trigger image generation via gemini-3-pro-image-preview. Pitfall: Relying on the model to decide when to show images leads to hallucination and increased latency.
- Use Case: Character visual consistency is maintained by extracting a description once and injecting it into every scene prompt. Pitfall: Starting every generation from scratch causes character appearance to change inconsistently between images.
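The SceneDetector and character-consistency ideas above can be sketched together. This is an illustrative reconstruction, not the author's implementation: only the ‘you see’ cue comes from the article, while the other cue phrases, the cooldown, the prompt layout, and all names are assumptions.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# 'you see' comes from the article; the other cues are assumed examples.
CUE_PATTERN = re.compile(r"\byou (?:see|enter|notice)\b", re.IGNORECASE)


@dataclass
class SceneDetector:
    """Scans transcript text server-side and decides when to request an image."""

    character_description: str  # extracted once, reused in every prompt
    cooldown_s: float = 20.0    # assumption: avoid back-to-back generations
    _last_trigger: float = field(default=float("-inf"), init=False)

    def check(self, transcript: str, now: float) -> Optional[str]:
        """Return an image prompt if a visual cue fires, else None."""
        if not CUE_PATTERN.search(transcript):
            return None
        if now - self._last_trigger < self.cooldown_s:
            return None
        self._last_trigger = now
        return (
            f"Scene: {transcript.strip()}\n"
            f"Character (keep consistent): {self.character_description}"
        )


detector = SceneDetector("a scarred elven ranger in a green cloak")
p1 = detector.check("You see a ruined tower on the hill.", now=0.0)
p2 = detector.check("You see a heavy oak door.", now=5.0)   # inside cooldown
p3 = detector.check("The wind howls.", now=60.0)            # no visual cue
print(p1 is not None, p2, p3)  # True None None
```

Because the trigger is a plain pattern match on the transcript, the model never has to emit a tool call, and the fixed character description travels with every image prompt.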