GPT Realtime 2.0 — Quick Reference

Prompt Components Checklist

Component	One-liner
Role and Objective	Who the assistant is and the task it’s there to complete.
Personality and Tone	Warmth, professionalism, conversational style.
Language	Default response language and spoken phrasing style.
Greeting	Standard polite opener for new conversations.
Reasoning	Controls how deeply the model thinks before responding. Set under Voice AI Model Settings/Model Parameter Controls > Reasoning Efforts. Takes values from 0 to 4: 0 = minimal, 1 = low (default), 2 = medium, 3 = high, 4 = very high. Higher values increase latency and cost.
Message Channels	Spoken vs. hidden content; no internal jargon spoken aloud.
Preambles	Short spoken update before noticeable internal work (tool call, multi-step reasoning, lookup).
Verbosity	How short or long spoken replies should be.
Tools	When to call tools, how to interpret results, how to handle failures.
Don’t Announce Tools	Never expose tool names, system mechanics, or backend details.
Input Gathering	Collect all required inputs before invoking a tool.
Capability Boundaries	Don’t claim abilities beyond what’s actually available.
Unclear Audio	Ask the user to repeat; on bad audio, don’t guess and don’t speak a preamble.
Entity Capture	Read back and confirm critical entities (IDs, numbers, amounts).
Long Context Behavior	Reuse customer details and prior tool results across the session.
No Human Handoff	Never offer to transfer or escalate to a human agent.
Escalation / Limits	Acknowledge limits honestly; suggest in-scope alternatives.
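
The Reasoning row above can be captured as a small lookup so that out-of-range values are caught before they reach the settings screen. This is only a sketch: the 0 to 4 scale, the labels, and the default of 1 come from the table, while the names `REASONING_EFFORT_LABELS` and `describe_reasoning_effort` are hypothetical and not part of any product API.

```python
# Labels and default taken from the checklist; all names here are hypothetical.
REASONING_EFFORT_LABELS = {
    0: "minimal",
    1: "low",
    2: "medium",
    3: "high",
    4: "very high",
}
DEFAULT_REASONING_EFFORT = 1  # "low" is the documented default


def describe_reasoning_effort(level: int = DEFAULT_REASONING_EFFORT) -> str:
    """Return the label for a reasoning-effort level, rejecting invalid values."""
    if isinstance(level, bool) or not isinstance(level, int):
        raise TypeError("reasoning effort must be an integer from 0 to 4")
    if level not in REASONING_EFFORT_LABELS:
        raise ValueError(f"reasoning effort must be 0 to 4, got {level}")
    return REASONING_EFFORT_LABELS[level]


print(describe_reasoning_effort())   # default level
print(describe_reasoning_effort(3))  # a deeper, slower, costlier setting
```

Validating the value up front keeps a typo such as 5 from silently falling back to some other behavior.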

Structure the Prompt in Labeled Sections: Organize the system prompt under clear named sections (Role, Tools, Preambles, Entity Capture, etc.) rather than a single prose block. The model locates relevant instructions faster, conflicting rules are easier to spot, and iterating on one behavior doesn’t risk breaking another. Only include sections relevant to your use case.
Tune Reasoning Effort to the Task: Reasoning effort trades latency for depth of thought. Default to the lowest setting that still gets the job done.
Use Preambles Deliberately: Trigger a preamble before a tool call that may take noticeable time, before multi-step reasoning, or before an escalation; in short, anywhere silence would feel unresponsive.
Define Verbosity Explicitly: Telling the model to “be brief” leaves too much open to interpretation. Instead, specify the expected length per task type: for example, one to two sentences for direct answers, one question at a time for clarifications, a summary-then-next-step pattern for tool results, and a tradeoffs-focused structure for comparisons.
Capture Exact Entities Carefully: Collect one value at a time, then read each identifier back and confirm it before making any tool call that depends on it.
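
As an illustration of the labeled-sections and explicit-verbosity guidance, the sketch below assembles a system prompt from named sections and spells out verbosity per task type. It makes several assumptions: the `# Section` heading format, the helper name `build_system_prompt`, and the sample role text are placeholders chosen for this example, not a format the model requires.

```python
# Hypothetical helper: the "# Section" heading format is an assumption,
# not a format required by the model.
def build_system_prompt(sections: dict[str, str]) -> str:
    """Join non-empty sections into one prompt, each under its own label."""
    parts = []
    for name, body in sections.items():
        text = body.strip()
        if text:  # skip sections that are not relevant to this use case
            parts.append(f"# {name}\n{text}")
    return "\n\n".join(parts)


# Verbosity spelled out per task type instead of a bare "be brief".
VERBOSITY_RULES = "\n".join([
    "- Direct answers: one to two sentences.",
    "- Clarifications: ask one question at a time.",
    "- Tool results: give a summary, then the next step.",
    "- Comparisons: focus on the tradeoffs.",
])

prompt = build_system_prompt({
    "Role and Objective": "You are a voice assistant for order-status questions.",
    "Verbosity": VERBOSITY_RULES,
    "Entity Capture": "Collect one value at a time. Read back IDs and confirm them before any tool call.",
    "Greeting": "",  # left empty, so it is omitted from the prompt
})
print(prompt)
```

Keeping each behavior in its own entry means a change to, say, the verbosity rules touches one value and leaves the other sections untouched.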