API upgrades enable enterprises to create smarter multimodal voice agents with PBX integration and remote tool access.
OpenAI has expanded the capabilities of its gpt-realtime API by adding support for remote Model Context Protocol (MCP) servers and the Session Initiation Protocol (SIP). The new features are aimed at helping enterprises build more intelligent, autonomous, and multimodal voice-based agents. With these protocols, developers can extend AI-driven voice systems beyond basic transcription and conversation, letting agents call external tools and connect directly to enterprise communication infrastructure.
With remote MCP server support, developers can connect their voice agents to capabilities hosted on external servers rather than being restricted to tools defined locally. This gives agents broader access to tools, applications, and data sources that can be pulled into a conversation as needed. OpenAI explained that once a server is connected, the API manages the tool calls automatically, eliminating the need for manual integration work. This design lowers the barrier for enterprises that want to add specialized functions to their agents.
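As a rough illustration of what connecting a remote MCP server might look like, the Python sketch below builds the kind of session configuration event a client could send to the API. It is a minimal sketch under stated assumptions, not OpenAI's reference code: the field names (`type: "mcp"`, `server_label`, `server_url`, `require_approval`) reflect our understanding of the published event shape and should be checked against the current API reference, while the helper name `build_session_update`, the label `crm`, and the URL are hypothetical.

```python
import json


def build_session_update(server_label: str, server_url: str,
                         require_approval: str = "never") -> dict:
    """Build a session update event that registers one remote MCP server.

    Assumption: the field names below mirror the documented event shape
    as we understand it; verify them against the current API reference.
    """
    return {
        "type": "session.update",
        "session": {
            "type": "realtime",
            "tools": [
                {
                    "type": "mcp",
                    "server_label": server_label,   # short name for the server
                    "server_url": server_url,       # where the MCP server is hosted
                    "require_approval": require_approval,
                }
            ],
        },
    }


# Hypothetical server label and URL, used only for illustration.
event = build_session_update("crm", "https://mcp.example.com/sse")
payload = json.dumps(event, indent=2)
print(payload)
```

The client would send `payload` over its existing connection to the API. After that, according to OpenAI's description, tool discovery and tool calls are handled by the API rather than by application code.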
The addition of SIP support further strengthens the enterprise applicability of gpt-realtime. SIP, a long-established signaling protocol for setting up and managing real-time voice calls over IP networks, enables direct integration of AI agents with PBX systems and phone networks. Businesses can therefore deploy AI-driven agents for customer-facing tasks such as call routing, appointment scheduling, or automated multilingual support within their existing telephony infrastructure. Such integration is particularly valuable for contact centers seeking to reduce agent workloads while maintaining high-quality customer interactions.
By combining MCP and SIP support, OpenAI is positioning gpt-realtime as a platform for building enterprise-grade voice solutions. From smarter call handling to dynamic access to remote tools, the update paves the way for enterprises to build autonomous, multimodal agents capable of handling diverse and complex workflows. It also reflects OpenAI’s continued focus on connecting advanced AI models to real-world enterprise applications, where integration and extensibility are key to adoption.

