Prompting gpt-realtime

Our voice agents are built on top of ‘gpt-realtime’, a small, real-time OpenAI model designed for conversation. To build the best voice agents, you need to know gpt-realtime’s capabilities, limitations, and best practices. Remember: most of the classic use cases for voice agents (banking, insurance, telecom, etc.) are already solved by some architectural pattern; you just need to find the right one and adapt it to your use case.

Start Here: The Bible

You MUST read and internalize this guide:
OpenAI Realtime Prompting Guide.
This guide contains the exact patterns the model was trained on and how to prompt it for those patterns.
It also contains example solutions to many use cases.
Important note: This guide also contains instructions for prompting the model’s “Voice”. These are only relevant if you are using the Voice-To-Voice orchestrator.

Mental Model: Realtime Model Is a Thin Conversation Model

Short context window: Treat gpt-realtime as having very limited working memory.
- The model has a context window of about 32k tokens, but performance degrades after about 10k tokens.
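
One way to respect the roughly 10k-token limit is to trim old turns before they reach the model. The sketch below is only an illustration: the 4-characters-per-token estimate, the `trim_history` helper, and the turn format are assumptions, not part of any gpt-realtime API.

```python
from typing import Dict, List

# Assumption: ~4 characters per token is a rough heuristic, not the real tokenizer.
CHARS_PER_TOKEN = 4
# Budget based on the "degrades after about 10k tokens" observation.
TOKEN_BUDGET = 10_000


def estimate_tokens(text: str) -> int:
    """Very rough token estimate; swap in a real tokenizer if you have one."""
    return max(1, len(text) // CHARS_PER_TOKEN)


def trim_history(turns: List[Dict[str, str]], budget: int = TOKEN_BUDGET) -> List[Dict[str, str]]:
    """Keep the most recent turns whose estimated size fits inside the budget."""
    kept: List[Dict[str, str]] = []
    used = 0
    for turn in reversed(turns):  # walk from newest to oldest
        cost = estimate_tokens(turn["text"])
        if used + cost > budget:
            break
        kept.append(turn)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

Dropping the oldest turns first is the simplest policy; summarizing them inside a tool or background LLM is another option if early details matter.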
Conversational, not reasoning:
- Great at turn-by-turn conversation, style, pacing, and making the interaction feel natural.
- Performs badly when:
  - The prompt is long and strict
  - Too many tools are available
  - Tools return too much data
  - The model needs to format data (such as dates)
  - The conversation is pre-scripted
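
Two of the weak spots above (tools returning too much data, and the model formatting dates) can be handled inside the tool itself. The following is a minimal sketch under stated assumptions: `slots_tool_result`, the `start` field, and the three-slot cap are hypothetical names and values chosen for illustration.

```python
from datetime import datetime
from typing import Dict, List

MAX_SLOTS = 3  # assumption: a handful of options is enough for one spoken turn

# Hardcoded English names so the output does not depend on the system locale.
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def spoken_date(iso_timestamp: str) -> str:
    """Turn an ISO 8601 timestamp into words the model can read out as-is."""
    dt = datetime.fromisoformat(iso_timestamp)
    hour = dt.hour % 12 or 12
    suffix = "AM" if dt.hour < 12 else "PM"
    minutes = f":{dt.minute:02d}" if dt.minute else ""
    return f"{DAYS[dt.weekday()]}, {MONTHS[dt.month - 1]} {dt.day} at {hour}{minutes} {suffix}"


def slots_tool_result(raw_slots: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Hypothetical tool output: a few short strings instead of raw records."""
    # ISO timestamps in the same format sort correctly as plain strings.
    earliest = sorted(raw_slots, key=lambda slot: slot["start"])[:MAX_SLOTS]
    return {"slots": [spoken_date(slot["start"]) for slot in earliest]}
```

The model then only has to read short, ready-made phrases aloud, which keeps both the context and the formatting burden small.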
Instruction following is not perfect:
- If you have a flow of “do X only if Y happens”, the model will sometimes do X even if Y did not happen.
- Use tools to enforce exact policy before doing an action.
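
A “do X only if Y” rule can be enforced inside the tool, so a mistaken tool call from the model does no harm. This is a sketch, not a prescribed implementation: `CallState`, `cancel_policy_tool`, and the result fields are hypothetical, and a plain code check stands in for the policy decision (the same gate could call a stronger LLM instead).

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class CallState:
    """Hypothetical per-call state kept by the application, not by the model."""
    identity_verified: bool = False


def cancel_policy_tool(state: CallState, policy_id: str) -> Dict[str, str]:
    """Hypothetical tool handler: do X (cancel) only if Y (caller is verified)."""
    if not state.identity_verified:
        # Refuse in code; the model only gets a short hint about what to do next.
        return {"status": "blocked", "say": "Please verify the caller first."}
    # A real handler would call the backend here (omitted in this sketch).
    return {"status": "done", "say": f"Policy {policy_id} is cancelled."}
```

Even if the model calls the tool too early, the action does not run, and the short `say` text steers the conversation back on track.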
The model likes simple words:
- Don’t use complex symbols; use the most common words.
- Write the prompt so a 5-year-old can understand it.
- The model was trained on text, so give it the words it most likely saw in its training data: the most common English words.
- For example, prefer service_booking over technician_scheduling.

Context Engineering ‘gpt-realtime’: A Checklist

Short prompt: only behavior and interaction instructions and tool-use logic.
8-10 tools at most.
All heavy lifting offloaded to tools or background LLMs inside those tools.
Do not trust the model to follow exact policy before doing an action; offload these decisions to a smarter LLM inside tools.
Keep the context short, clean and high-signal.
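
Parts of this checklist can be checked automatically before an agent ships. The sketch below is illustrative only: `check_agent_config` is a hypothetical helper, the tool cap follows the checklist, and the 600-word prompt cap is a made-up placeholder to tune for your own agents.

```python
from typing import Dict, List

MAX_TOOLS = 10          # from the checklist: 8-10 tools at most
MAX_PROMPT_WORDS = 600  # assumption: placeholder limit, not an official number


def check_agent_config(prompt: str, tools: List[Dict[str, str]]) -> List[str]:
    """Return a list of warnings; an empty list means the basic checks passed."""
    warnings: List[str] = []
    if len(tools) > MAX_TOOLS:
        warnings.append(f"Too many tools: {len(tools)} (limit {MAX_TOOLS}).")
    word_count = len(prompt.split())
    if word_count > MAX_PROMPT_WORDS:
        warnings.append(f"Prompt too long: {word_count} words (limit {MAX_PROMPT_WORDS}).")
    return warnings
```

A check like this fits well in CI or a pre-deploy script, where a non-empty warning list can block the release.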
