Shortening the “answer delay” with early media
Posted: May 16th, 2012 | Author: Michael | Filed under: UCMA 3.0 | Tags: 183 Session Progress, clipping, delay, early media, RTP, UCMA
It’s common for a UCMA app, before answering an incoming audio call, to perform some setup steps: finding an agent to take the call, preparing media, whatever. These steps may be necessary if, for example, the call is being set up as a back-to-back call and they can’t be done after the call is already established. In other cases, the way the call is set up makes the actual process of accepting the call take some time. Unfortunately, this can lead to callers hearing multiple seconds of ringing (or, even worse, silence) before the call is fully established. It can also lead to “clipping” effects where the beginning of a message or someone’s greeting is cut off because the media flow isn’t quite established yet. A feature called “early media” allows your application to get around these limitations and eliminate most of that delay or clipping.
The basic idea of early media is to allow endpoints to exchange media (RTP) packets before the SIP handshake to establish the call is completed. You can hear this in action on outbound calls from Lync to the PSTN – early media allows you to hear the actual ringing tones or failure tones from the PSTN, even though at that point the call isn’t established.
Without early media, the participants in a call can’t hear each other until the end of the SIP handshake, after the recipient responds with a 200 OK, as shown in the diagram below.
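In text form, a simplified version of that handshake looks roughly like the sketch below. It is an illustration only: it leaves out 100 Trying and any intermediate servers, and the 180 Ringing shown is just the typical provisional response, not something taken from a real trace.

```text
Caller                              UCMA app
  | INVITE (SDP offer)                 |
  |----------------------------------->|
  | 180 Ringing                        |
  |<-----------------------------------|
  |   ... app does its setup work ...  |
  |   (caller hears ringing/silence)   |
  | 200 OK (SDP answer)                |
  |<-----------------------------------|
  | ACK                                |
  |----------------------------------->|
  |<============= RTP ================>|
```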

With early media, the recipient sends back a 183 Session Progress message, which contains the Session Description Protocol (SDP) information that the calling endpoint needs in order to route media to the recipient. The caller acknowledges the 183 with a PRACK message, and the media can then start flowing even though the call isn’t fully established, as the diagram below shows.
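As a rough text sketch of the same exchange (again simplified, with 100 Trying and intermediate servers omitted), the media starts flowing well before the final 200 OK for the INVITE:

```text
Caller                              UCMA app
  | INVITE (SDP offer)                 |
  |----------------------------------->|
  | 183 Session Progress (SDP)         |
  |<-----------------------------------|
  | PRACK                              |
  |----------------------------------->|
  | 200 OK (for PRACK)                 |
  |<-----------------------------------|
  |<======= RTP (early media) ========>|
  |   ... app finishes its setup ...   |
  | 200 OK (for INVITE)                |
  |<-----------------------------------|
  | ACK                                |
  |----------------------------------->|
```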

UCMA applications do not do this automatically. If you want your UCMA application to send the 183 Session Progress to establish early media, you can call the AudioVideoCall.BeginEstablishEarlyMedia method on an incoming call that has not yet been accepted. Assuming the early media setup succeeds, the AudioVideoFlow attached to the call will then become active, and you can play music, ring tones, or whatever you like to the caller while the application is still preparing to accept the call.
Here’s a code example. This example shows an event handler for incoming audio calls that establishes early media, waits 10 seconds, and then accepts the call. Meanwhile, when the AudioVideoFlow becomes available, it plays music to the caller.
void OnAVCallReceived(object sender,
    CallReceivedEventArgs<AudioVideoCall> e)
{
    e.Call.StateChanged +=
        new EventHandler<CallStateChangedEventArgs>(
            OnCallStateChanged);
    // Subscribe before establishing early media so that the flow's
    // state changes can be observed.
    e.Call.AudioVideoFlowConfigurationRequested +=
        new EventHandler<AudioVideoFlowConfigurationRequestedEventArgs>(
            OnAVFlowConfigurationRequested);
    try
    {
        // Sends the 183 Session Progress to the caller.
        e.Call.BeginEstablishEarlyMedia(
            ar =>
            {
                try
                {
                    e.Call.EndEstablishEarlyMedia(ar);
                    Console.WriteLine("Early media established.");
                    // Stand-in for the app's real setup work (demo only).
                    Thread.Sleep(10000);
                    e.Call.BeginAccept(ar2 =>
                        {
                            try
                            {
                                e.Call.EndAccept(ar2);
                            }
                            catch (RealTimeException ex)
                            {
                                Console.WriteLine(ex);
                            }
                        },
                        null);
                }
                catch (RealTimeException ex)
                {
                    Console.WriteLine(ex);
                }
                catch (InvalidOperationException ex)
                {
                    // For example, the caller hung up during the wait,
                    // so the call can no longer be accepted.
                    Console.WriteLine(ex);
                }
            },
            null);
    }
    catch (InvalidOperationException ex)
    {
        Console.WriteLine(ex);
    }
}
void OnAVFlowConfigurationRequested(object sender,
    AudioVideoFlowConfigurationRequestedEventArgs e)
{
    Console.WriteLine("flow config requested");
    e.Flow.StateChanged +=
        new EventHandler<MediaFlowStateChangedEventArgs>(
        OnFlowStateChanged);
}
void OnFlowStateChanged(object sender,
    MediaFlowStateChangedEventArgs e)
{
    Console.WriteLine("Flow state changed from {0} to {1}",
        e.PreviousState, e.State);
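    if (e.State == MediaFlowState.Active)
    {
        // The flow becomes active once early media is established,
        // so playback starts before the call is accepted.
        // _player is assumed to be a Player field whose media source
        // was set up elsewhere.
        AudioVideoFlow flow = (AudioVideoFlow)sender;
        _player.AttachFlow(flow);
        _player.Start();
    }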
}
void OnCallStateChanged(object sender, CallStateChangedEventArgs e)
{
    Console.WriteLine("Call state changed from {0} to {1}",
        e.PreviousState, e.State);
}
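The example above uses a _player field without showing where it comes from. Here is a minimal sketch of one way it might be prepared at application startup, assuming the music lives in a WMA file. The HoldMusic class name, the PreparePlayer method, and the file path parameter are hypothetical names for illustration, and error handling is left out for brevity.

```csharp
using System;
using Microsoft.Rtc.Collaboration.AudioVideo;

class HoldMusic
{
    // The Player that the flow state handler attaches flows to.
    private Player _player;

    // Hypothetical helper: call once at startup, before any calls arrive.
    public void PreparePlayer(string wmaPath)
    {
        var source = new WmaFileSource(wmaPath);

        // Load the file up front. Blocking on the End call is acceptable
        // here only because this runs at startup, not on a call thread.
        source.EndPrepareSource(
            source.BeginPrepareSource(
                MediaSourceOpenMode.Buffered, null, null));

        _player = new Player();
        _player.SetSource(source);
    }
}
```

Preparing the source ahead of time matters here: if the file were loaded only after the call arrived, that work would eat into the same window that early media is supposed to fill.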
If users are experiencing annoying delays or clipping when calling your UCMA application, this may help. Please comment if you have any questions!
Early media works perfectly for Lync callers, but some “qualified” PSTN gateways (e.g. AudioCodes Mediant) deliver a ringback tone to PSTN callers instead of the early media played by the UCMA application. This issue is very annoying, but hey, this is SIP: the world of interoperability. So please be careful with early media.
Thanks Csaba – good to know.