OpenAI GPT-Realtime-2 Voice Models Launch: What CRE Investors Need to Know

By Avi Hacker, J.D. · 2026-05-10

What is GPT-Realtime-2? GPT-Realtime-2 is OpenAI's new voice intelligence model, announced on May 7, 2026 as part of a three-model release that also includes GPT-Realtime-Translate and GPT-Realtime-Whisper. It is OpenAI's first voice model built with GPT-5-class reasoning. It expands the context window from 32K to 128K tokens and supports continuous live conversations that handle interruptions, parallel tool calls, and graceful recovery from errors. For CRE owners, brokers, and property managers, this is the first realistic voice stack for production tenant agents, multilingual leasing lines, and live broker tour documentation. For broader context on the tools shaping the market, see our pillar guide on AI property management.

Key Takeaways

  • OpenAI released GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper on May 7, 2026, bringing GPT-5-class reasoning, 70-language translation, and streaming transcription to voice agents.
  • GPT-Realtime-2 is priced at $32 per 1M audio input tokens and $64 per 1M audio output tokens, with adjustable reasoning effort from minimal to xhigh to balance latency and depth.
  • Zillow, an early adopter, reported a 26-point lift in call success rate (95% vs 69%) on its hardest real estate voice benchmark.
  • CRE practitioners can deploy these models for 24-hour leasing lines, multilingual tenant calls, live tour transcription, and broker meeting capture without stitching together separate STT, LLM, and TTS pipelines.

GPT-Realtime-2 Explained

GPT-Realtime-2 is OpenAI's flagship voice model in the May 7, 2026 release. It is designed for continuous voice agents rather than the older speech-to-text plus LLM plus text-to-speech chain. Developers can configure adjustable reasoning effort from minimal to xhigh, enable preamble phrases like "let me check that" so the agent feels responsive, and let the model run parallel tool calls and narrate them audibly with phrases like "checking your calendar." Recovery behavior is stronger too. Instead of failing silently when a tool errors, the agent now says something like "I'm having trouble with that right now" and stays in the conversation.
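
To make the parallel tool call and recovery behavior concrete, here is a minimal application-side sketch in plain Python. It does not use the OpenAI API. The tool functions `check_calendar` and `lookup_rent`, the unit label, and the `run_tools` helper are all hypothetical names invented for illustration. The sketch only shows the general pattern: run tools concurrently, and turn a failed tool into a spoken fallback line instead of a silent failure.

```python
import asyncio

# Hypothetical tools. A real deployment would call a calendar or
# property-management system here.
async def check_calendar(unit: str) -> str:
    return f"Unit {unit} has a tour slot at 10:00"


async def lookup_rent(unit: str) -> str:
    # Simulated outage so the fallback path is exercised.
    raise TimeoutError("rent service unavailable")


# Fallback phrase quoted in the article.
FALLBACK = "I'm having trouble with that right now"


async def run_tools(unit: str) -> dict[str, str]:
    """Run both tools concurrently; map any failure to the spoken fallback."""
    names = ["check_calendar", "lookup_rent"]
    results = await asyncio.gather(
        check_calendar(unit),
        lookup_rent(unit),
        return_exceptions=True,  # collect errors instead of raising
    )
    return {
        name: FALLBACK if isinstance(res, Exception) else res
        for name, res in zip(names, results)
    }


outcome = asyncio.run(run_tools("4B"))
```

Because errors are collected rather than raised, the calendar result still reaches the caller even though the rent lookup failed, which is the behavior the paragraph above describes.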

On OpenAI's own benchmarks, GPT-Realtime-2 at the high setting scores 15.2% higher on Big Bench Audio than GPT-Realtime-1.5, and at xhigh scores 13.8% higher on Audio MultiChallenge for instruction following. The full announcement is on the OpenAI launch page.

What GPT-Realtime-Translate and GPT-Realtime-Whisper Add

  • GPT-Realtime-Translate: Real-time speech translation that takes more than 70 input languages and outputs in 13 target languages while keeping pace with the speaker. Priced at $0.034 per minute.
  • GPT-Realtime-Whisper: Streaming speech-to-text for live captions, meeting notes, and call documentation. Priced at $0.017 per minute.

Together with GPT-Realtime-2, this is the first time CRE operators can pull real-time translation, real-time transcription, and a reasoning voice agent from one provider on consistent infrastructure.

Key CRE Use Cases

  • 24-hour leasing lines: A voice agent handles inbound leasing calls after hours, qualifies prospects, books tours via tool calls into the calendar, and hands off to a human only when the lead is hot.
  • Multilingual tenant calls: GPT-Realtime-Translate lets multifamily, mixed-use, and industrial owners offer a live multilingual line without staffing native speakers in 70+ languages.
  • Live broker tour documentation: GPT-Realtime-Whisper transcribes the entire tour so the broker can stay present and still file a complete tour memo, with the LLM step summarizing into a deal note.
  • Maintenance triage: Property managers route after-hours work orders through a voice agent that can ask diagnostic questions, set priority, and dispatch a vendor without a human dispatcher.
  • Capital partner Q&A drills: Sponsors rehearse LP and lender calls with a voice agent that pushes back on assumptions, then transcribes the session for review.

Pricing Reality Check for CRE Operators

At $32 per 1M input audio tokens and $64 per 1M output audio tokens, GPT-Realtime-2 is not cheap on a raw token basis, but the unit economics for CRE workflows are favorable. A typical 5-minute leasing call costs in the low single-digit dollars, including translation if needed. A property manager handling 500 after-hours calls per month is looking at meaningful but predictable operating cost in exchange for clear NOI levers: faster lease-up, lower vacancy days, and reduced overtime for on-call staff. The relevant CRE math is not whether the API line item is small. It is whether the avoided vacancy days and avoided after-hours staffing exceed the API spend, and on most multifamily and industrial portfolios that calculation is comfortable.
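
The break-even logic above can be sketched as a small calculator. The prices come from this article. The token counts, call volume, and avoided-cost figure are placeholders chosen only for illustration, not measurements, and should be replaced with metered usage from your own deployment. The function names are hypothetical.

```python
# Prices quoted in the article (USD).
AUDIO_IN_PER_M = 32.0      # per 1M audio input tokens
AUDIO_OUT_PER_M = 64.0     # per 1M audio output tokens
TRANSLATE_PER_MIN = 0.034  # GPT-Realtime-Translate, per minute


def call_cost_usd(input_tokens: int, output_tokens: int,
                  translate_minutes: float = 0.0) -> float:
    """Cost of one call from metered token counts plus optional translation."""
    token_cost = (input_tokens * AUDIO_IN_PER_M
                  + output_tokens * AUDIO_OUT_PER_M) / 1_000_000
    return token_cost + translate_minutes * TRANSLATE_PER_MIN


def monthly_net_usd(calls: int, cost_per_call: float,
                    avoided_cost_usd: float) -> float:
    """Avoided vacancy and staffing cost minus API spend.

    A positive result means the avoided cost exceeds the API line item.
    """
    return avoided_cost_usd - calls * cost_per_call


# PLACEHOLDER usage for a 5-minute translated call, not a measurement.
per_call = call_cost_usd(input_tokens=40_000, output_tokens=15_000,
                         translate_minutes=5)

# PLACEHOLDER: 500 after-hours calls and $4,000 of avoided cost per month.
net = monthly_net_usd(calls=500, cost_per_call=per_call,
                      avoided_cost_usd=4_000.0)
```

With these placeholder inputs the per-call cost comes to about $2.41 and the monthly net to about $2,795. The point of the sketch is the structure of the comparison, so the inputs matter more than these specific outputs.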

How This Compares to GPT-5.5 Instant and ChatGPT for Excel

This release fits inside a broader OpenAI push in early May 2026. We covered the May 5 GPT-5.5 Instant default model swap, which cut high-stakes finance hallucinations by 52.5%, and the same-day ChatGPT for Excel and Google Sheets launch, which brings GPT-5.5 directly into the underwriter's spreadsheet. GPT-Realtime-2 extends that push from text-and-data workflows into voice-first workflows. For sponsors and operators evaluating a wider AI stack, comparison guides such as ChatGPT vs Claude vs Gemini for multifamily tenant credit screening are a useful map of where to deploy each model.

Risks and Caveats

  • Compliance: Voice agents that handle PII, tenant screening, or income data still need to comply with fair housing, FCRA, and state-level consumer protection rules. The model is faster than your policy review.
  • Recording consent: Two-party-consent states require explicit consent for call recording. The right pattern is consent at the top of the call, not after.
  • Hallucination on contract terms: Voice agents should not improvise lease terms, rent amounts, or commitments. Tool calls into a system of record are mandatory for anything material.
  • Fallback: When the agent says it cannot help, it must hand off to a human path quickly. Don't trap callers in a voice loop.
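
The guardrails in this list can be expressed as a simple routing rule that runs before the agent speaks. The sketch below is a hedged illustration, not a compliance implementation. The topic labels, action names, and the two-failure threshold are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical labels for topics that must come from a system of record.
MATERIAL_TOPICS = {"rent_amount", "lease_terms", "commitment"}
# Assumed threshold: hand off after this many failed turns.
MAX_FAILED_TURNS = 2


@dataclass
class CallState:
    consent_given: bool = False
    failed_turns: int = 0


def next_action(state: CallState, topic: str, record_found: bool) -> str:
    """Pick the next step for the agent given the call state and topic."""
    # Recording consent comes first, at the top of the call.
    if not state.consent_given:
        return "ask_recording_consent"
    # Do not trap the caller in a loop: escalate after repeated failures.
    if state.failed_turns >= MAX_FAILED_TURNS:
        return "handoff_to_human"
    # Material terms are never improvised; they need a record lookup.
    if topic in MATERIAL_TOPICS:
        return "answer_from_system_of_record" if record_found else "handoff_to_human"
    return "answer"


first_step = next_action(CallState(), "rent_amount", record_found=True)
```

Ordering the checks this way means consent is always resolved before anything else, and a missing record on a material topic routes to a human instead of a guess.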

If you are evaluating where voice agents fit in your stack, The AI Consulting Network specializes in exactly this kind of build-versus-buy and risk diligence for CRE owners and brokers.

Real-World CRE Applications

Zillow's reported 26-point lift in call success rate is the most concrete real estate datapoint OpenAI has shared. On a 1,000-call-per-month leasing line, moving from 69% to 95% success is roughly 260 additional qualified conversations, which at typical multifamily conversion rates is a meaningful lease-up acceleration. According to CBRE and broader market data, only 5% of corporate occupiers report achieving most of their AI program goals despite 92% having initiated AI programs, and the AI in real estate market is projected to reach $1.3 trillion by 2030 with a 33.9% CAGR. Voice agents are one of the few AI workloads where the operational ROI is clean enough to actually close that 5% gap. CRE investors looking for hands-on AI implementation support can reach out to Avi Hacker, J.D. at The AI Consulting Network.
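
The 260-conversation figure above is straightforward arithmetic, sketched here so it can be rerun with your own call volume. The success rates come from the Zillow result cited in this article. The conversion rate is a hypothetical input, since the article does not state one, and both function names are invented for illustration.

```python
def extra_qualified_conversations(calls_per_month: int,
                                  baseline_pct: int,
                                  new_pct: int) -> int:
    """Additional successful calls per month from a lift in success rate.

    Rates are whole percentages so the arithmetic stays exact.
    """
    return calls_per_month * (new_pct - baseline_pct) // 100


def extra_leases(extra_conversations: int, conversion_rate: float) -> float:
    """Expected extra leases; conversion_rate is a hypothetical input."""
    return extra_conversations * conversion_rate


# 1,000 calls per month, moving from 69% to 95% success.
extra = extra_qualified_conversations(1_000, 69, 95)  # 260
```

Passing `extra` and your own portfolio's conversion rate to `extra_leases` turns the lift into an expected lease count.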

Frequently Asked Questions

Q: When did OpenAI release GPT-Realtime-2?

A: OpenAI announced GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper on May 7, 2026, with immediate availability in the OpenAI API and testing in the OpenAI Playground.

Q: How much does GPT-Realtime-2 cost?

A: GPT-Realtime-2 is priced at $32 per 1 million audio input tokens, $0.40 per 1 million cached input tokens, and $64 per 1 million audio output tokens. GPT-Realtime-Translate is $0.034 per minute and GPT-Realtime-Whisper is $0.017 per minute.

Q: What CRE workflows benefit most from GPT-Realtime-2?

A: Leasing call handling, multilingual tenant lines, after-hours maintenance triage, live broker tour transcription, and capital partner rehearsal sessions. The largest reported real estate result so far is Zillow's 26-point lift in call success rate.

Q: Can voice agents replace property management call centers?

A: Not entirely. Voice agents can handle 60 to 80% of routine leasing and maintenance triage in production deployments, but human escalation paths, fair housing compliance review, and recording-consent workflows remain mandatory.

Q: Where can I learn more about AI tools for property management?

A: Start with our pillar guide on AI property management tools, then review the GPT-5.5 Instant update and the ChatGPT for Excel and Google Sheets launch for the broader OpenAI stack.