Media high availability/re-anchoring using OpenSIPS 3.2

Using a media relay server (such as RTPProxy, RTPEngine or MediaProxy) in your VoIP system is a rather common requirement, for various reasons: handling media for NATed clients, legal compliance (recording) requirements, or offering enhanced services. Thus, in order to provide high availability for your services, you also need to provide it for your media relays. After all, there’s no point in restoring a SIP call after a bunch of nodes crash if you cannot restore its media, is there?

This article describes a simple, flexible and efficient method of achieving high availability for your calls’ media, regardless of the media relay server used, by re-anchoring an ongoing call to a new media relay server. This new technique is available starting with OpenSIPS 3.2, and places no requirements on the media relays used or on the user agent clients: it relies on pure SIP mechanisms.

State of the art

At the time this article was written, among the media relays considered, only RTPEngine provided high availability for its media sessions, by using the Redis keyspace notifications feature. While this is a production-ready solution, it has some disadvantages, such as having to configure, monitor, maintain and ensure high availability for the Redis cluster as well. Moreover, you might need a VIP shared between the RTPEngine nodes to make sure RTP fails over properly when needed.

Our media re-anchoring solution presented in this post handles media fail-over at SIP signaling level, thus it does not rely on any implementation at the media server back-end.

Media re-anchoring

When a SIP call is established, one of the media relay servers available in the platform is chosen to proxy the RTP for the entire call. We say that the media is anchored to that specific media relay, because the entire RTP flow goes through that media node, and in normal circumstances there is no need to change that. However, what happens if that node crashes? All the calls anchored to that node will basically lose their media.

Media re-anchoring is the process of replacing the initially chosen RTP media server of an existing, ongoing call with a new, available media server node. This is done by simply sending both call participants an in-dialog re-INVITE message, each advertising the newly chosen RTP relay node in its SDP body. This essentially means that the call’s media is re-anchored to a new RTP relay node.

There are several cases where media re-anchoring is useful:

  • Media server crashes: replace the media path of ongoing calls whose media relay server is no longer available, thus ensuring high availability for the calls’ media
  • Maintenance: you want to move all calls off a media node in order to do some hardware/software maintenance. Re-anchoring calls means you no longer have to wait for the server to drain (finish) all existing calls; you just move them all to a new server and you’re good to go
  • Load balancing: whenever you decide you need to scale up your platform and add a new set of media nodes, you can re-balance the existing calls across all the available nodes, resulting in an instantly balanced setup

RTP Relay

The newly added RTP Relay module in OpenSIPS 3.2 provides the primitives to easily anchor a media relay in a call, as well as the capabilities to do media re-anchoring when needed. The module is developed in a generic manner, independent of the media relay back-end used, and provides only the logic to anchor/re-anchor a node. The actual communication with the media server is handled by the specific back-end module’s implementation. Currently the RTPProxy and RTPEngine modules support it, so you will need to use at least one of them to benefit from this feature.

The new module provides a very simple interface for anchoring a media server in a call, for both the initial request and the sequential updates of the call. All you need to do is call the rtp_relay_engage() function in your script, specifying the media relay engine you want to use (rtpproxy or rtpengine) and, optionally, the set of nodes to be used. From that point on, the module handles all the communication with the back-end nodes and updates the SDPs of all initial and sequential requests and replies. The module relies on the dialog module to keep track of the call’s state and to generate the in-dialog requests needed for re-anchoring.
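
As an illustration of the optional set argument, here is a minimal sketch that places two RTPProxy nodes in separate sets and engages a node from a specific set. The set numbers (1 and 2) and the node addresses are illustrative assumptions, and the "N == socket" set syntax should be checked against the RTPProxy module documentation for your version.

```
loadmodule "dialog.so"
loadmodule "rtp_relay.so"
loadmodule "rtpproxy.so"

# hypothetical layout: one RTPProxy node per set
modparam("rtpproxy", "rtpproxy_sock", "1 == udp:192.0.2.11:22222")
modparam("rtpproxy", "rtpproxy_sock", "2 == udp:192.0.2.12:22222")

route {
    ...
    if (is_method("INVITE") && !has_totag()) {
        create_dialog();
        # engage an RTPProxy node from set 1 rather than the default set
        rtp_relay_engage("rtpproxy", 1);
    }
    ...
}
```

Leaving out the second argument falls back to the default set.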

One can also tune the flags passed to the media servers using the $rtp_relay variable (for specifying caller’s flags) and $rtp_relay_peer (for specifying callee’s flags). These flags are passed along to the back-end module transparently, so make sure you provide the correct flags for the chosen engine. You can view the possible values in each module’s documentation page.

Besides being able to tune the flags passed along to the Media Server, the $rtp_relay variable can also be used to specify the type of the RTP to be used for a peer (using the $rtp_relay(type)), as well as the interface to be used (using $rtp_relay(iface)), or the IP advertised in SDP ($rtp_relay(ip)). You can find more information about their usage in the module’s documentation page.
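
A minimal sketch of how these variables might be combined is shown below, here with RTPEngine as the back-end. The flag and type values are the ones passed to RTPEngine elsewhere in this article. The interface names (internal, external) and the advertised address are placeholders that would have to match your own setup, and the snippet assumes the dialog, rtp_relay and rtpengine modules are loaded with at least one RTPEngine node defined.

```
route {
    ...
    if (is_method("INVITE") && !has_totag()) {
        create_dialog();

        # caller leg: flags plus the (assumed) internal interface
        $rtp_relay = "replace-origin";
        $rtp_relay(iface) = "internal";

        # callee leg: offer SRTP on the (assumed) external interface
        $rtp_relay_peer = "replace-origin";
        $rtp_relay_peer(type) = "SRTP";
        $rtp_relay_peer(iface) = "external";
        # placeholder address to advertise in the callee's SDP
        $rtp_relay_peer(ip) = "203.0.113.10";

        rtp_relay_engage("rtpengine");
    }
    ...
}
```

Since the values are handed to the back-end module as they are, the exact strings accepted for type and iface depend on the engine in use.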

Example

RTP anchoring

Script configuration is rather trivial as well: all you have to do is load the module and the back-end module(s), and call the rtp_relay_engage() function (optionally with additional flags, if needed). Let us take as an example a scenario where we are using two RTPProxy nodes and want to re-anchor media from one to the other at a certain point. For this, we need to load the necessary modules:

loadmodule "dialog.so"
loadmodule "rtp_relay.so"
loadmodule "rtpproxy.so"

In this example, we will use two RTPProxy nodes, each with its own IP: 10.11.11.11 and 10.12.12.12.

modparam("rtpproxy", "rtpproxy_sock", "udp:10.11.11.11:22222")
modparam("rtpproxy", "rtpproxy_sock", "udp:10.12.12.12:22222")

Now, all we need to do is engage the RTP relay node when a call starts. A short snippet that does this looks like this:

route {
    ...
    if (is_method("INVITE") && !has_totag()) {
        create_dialog();
        $rtp_relay = "co"; # check the RTPProxy documentation for
                           # the meaning of these (optional) flags
        $rtp_relay_peer = "co"; # do the same thing for the callee
        rtp_relay_engage("rtpproxy");
    }
    ...
}

This snippet will engage one of the RTPProxy nodes in the default set (0) in the current call. It will also modify the session’s c= and o= lines in the resulting SDP of both caller and callee. Note that you can also modify the flags of the callee when a reply is received, for example:

onreply_route[reply] {
    ...
    if (!nat_uac_test("8"))
        $rtp_relay += "r";
    ...
}

This will trust callee’s advertised address in SDP, if a public one is used.
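
Keep in mind that a named reply route only runs if it is armed on the transaction. A minimal sketch of how the request route might arm it follows. It assumes the tm module is loaded (for t_on_reply() and t_relay()) and that the nathelper module is loaded to provide nat_uac_test() in the reply route; neither appears in the module list earlier in this example.

```
route {
    ...
    if (is_method("INVITE") && !has_totag()) {
        create_dialog();
        $rtp_relay = "co";
        $rtp_relay_peer = "co";
        rtp_relay_engage("rtpproxy");

        # arm onreply_route[reply] so the callee flags
        # can be adjusted when replies arrive
        t_on_reply("reply");
    }
    ...
    t_relay();
}
```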

And now we are all set – the call is started and when established, the traffic flows between caller and callee through one of the nodes. As an example, I’ve taken a screen shot of a test call I’ve been doing using this setup.

RTP re-anchoring

As you can see, after the call is established, all the traffic flows through the 10.11.11.11 node. However, 13 seconds after that, I decide that I want to switch the traffic to the other node – to do so, I’m running the following command in my cli:

# opensips-cli -x mi rtp_relay_update \
        engine=rtpproxy \
        set=0 \
        node=udp:10.11.11.11:22222 \
        new_node=udp:10.12.12.12:22222

The previous command sends an update for all the calls that are using the RTPProxy engine, node udp:10.11.11.11:22222 from set 0, re-anchoring them to a new node, udp:10.12.12.12:22222. Therefore, as you can see, after I run the command and the two re-INVITEs are sent to the participants, the entire media flow moved from 10.11.11.11 to 10.12.12.12. This is the power of media re-anchoring!

RTP Media engine change

But that is not all! Imagine that during a call, you decide that the conversation has become sensitive, and you want to secure one of its legs (let’s say the callee’s). RTPProxy can’t convert from RTP to SRTP (yet), so you will have to swap to a new media relay engine that can do that. This is as simple as running the rtp_relay_update_callid MI command, providing the necessary details:

# opensips-cli -x mi rtp_relay_update_callid \
        callid=RANDOM_CALLID@10.0.0.7 \
        engine=rtpengine \
        flags='{"callee":{"type":"SRTP","flags":"replace-origin"},"caller":{"flags":"replace-origin"}}'

The command above re-anchors the call to a different engine (RTPEngine) and offers the callee an SRTP body, securing the callee leg of the conversation.

Conclusions

Media re-anchoring is a simple, yet powerful tool for ensuring media high availability for your services.

If you want to find out more about this topic, as well as about other interesting and useful tools the new OpenSIPS 3.2 provides, make sure you are not missing our annual OpenSIPS Summit Distributed 2021 happening on-line on 6-10 September 2021.
