Troubleshooting OpenSIPS script

troubleshootingWhat makes OpenSIPS such an attractive and powerful SIP solutions is its high level of programmability, thanks to its C-like configuration script. But once you get into the “programming” area, you will automatically need tools and skills for troubleshooting.

So here there are some some tips and tools you can use in OpenSIPS for “debugging” your configuration script.

Controlling the script logging

The easiest way to troubleshoot your script is of course by using the xlog() core function and print your own messages. Still the internal OpenSIPS logs (generated by the OpenSIPS code) do provide a lot of information about what OpenSIPS is doing.

The challenge with the logging is to control the amount and content of messages you want to log. Otherwise you will end up with huge piles of logs, completely impossible to read and follow.

By using the $log_level script variable you can dynamically change the logging level (to make it more or less verbose) from the script level. You can do this only for parts of the script:

log_level= -1 # errors only
…..
{
……
$log_level = 4; # set the debug level of the current process to DEBUG
uac_replace_from(….);
$log_level = NULL; # reset the log level of the current process to its default level
…….
}

or only for certain messages, based on source IP (you can use the permissions module for a more dynamic approach, controlled via DB)

if ($si==”11.22.33.44″)
$log_level = 4;

or some parts of the message (you can use the dialplan module for the dynamic approach):

if ($rU==”911″)
$log_level = 4;

IMPORTANT: do not forget to reset the log level back to default before terminating the script, otherwise that log level will be used by the current process for all the future messages it will handle.

Tracing the script execution

Still, using the xlog() core function may not be the best option as it implies a high script pollution and many script changes (with restarts, of course).

So, a better alternative is the script_trace() core function. Once you enabled the script tracing, OpenSIPS will start logging its steps though the script execution, printing each function that is called and its line in the script file.

This script tracing is really helpful when you want to understand or troubleshoot your script execution, answering questions like “why does my script not get to this route” or “why is the script function not called” or “I do not understand how the SIP message flows through my script“….. and many other similar problems.

The script_trace() function can help even more by allowing you to trace the value of certain variables (or parts of the message) during the script execution. Like “I do not understand where in script my RURI is changed“. So you simply attach to the function a log line (with variables, of course) that will be evaluated and printed for each function in the script:

script_trace( 1, “$rm from $si, ruri=$ru”, “me”);

will produce:

[line 578][me][module consume_credentials] -> (INVITE from 127.0.0.1 , ruri=sip:111211@opensips.org)
[line 581][me][core setsflag] -> (INVITE from 127.0.0.1 , ruri=sip:111211@opensips.org)
[line 583][me][assign equal] -> (INVITE from 127.0.0.1 , ruri=sip:111211@opensips.org)
[line 592][me][core if] -> (INVITE from 127.0.0.1 , ruri=sip:tester@opensips.org)
[line 585][me][module is_avp_set] -> (INVITE from 127.0.0.1 , ruri=sip:tester@opensips.org)
[line 589][me][core if] -> (INVITE from 127.0.0.1 , ruri=sip:tester@opensips.org)
[line 586][me][module is_method] -> (INVITE from 127.0.0.1 , ruri=sip:tester@opensips.org)
[line 587][me][module trace_dialog] -> (INVITE 127.0.0.1 , ruri=sip:tester@opensips.org)
[line 590][me][core setflag] -> (INVITE from 127.0.0.1 , ruri=sip:tester@opensips.org)

Again, you can enable the script tracing only for some cases (on demand). For example I want to trace only calls from a certain subscriber, so I can use the dialplan module and create in DB rules to match the subscribers I’m interested in tracing:

if ( dp_translate(“1″,”$fU/$var(foo)”) )
# caller must be traced according to dialplan 1
script_trace( 1, “$rm from $si, ruri=$ru”, “me”);

Benchmarking the script

Assuming that you get to a point where you managed to troubleshoot and fix your script in terms of the execution flow, you now may be interested in troubleshoot the script execution from the time perspective – how much time takes OpenSIPS to execute certain parts of the script.

This is a mandatory step if you want to perform a performance analysis of your OpenSIPS setup.

The benchmark module will help you to measure the time OpenSIPS took to execute different parts of the script:

bm_start_timer(“lookup-timer”);
lookup(“location”);
bm_log_timer(“lookup-timer”);

An interesting capability of the module is to provide information about the current usage, but also aggregated information from the past, like how many times a certain timer was used or what was the total time spent in a timer (see the full provided information). And even more, such information can be pulled via Management Interface via the bm_poll_results command, for external usage:

opensipsctl fifo bm_poll_results
register-timer
3/40/12/14/13.333333
9/204/12/97/22.666667
lookup-timer
3/21/7/7/7.000000
9/98/7/41/10.888889

Identifying script bottlenecks

But still you need to find the weak points of your script in terms on time to process. Or the bottlenecks of your script.

OpenSIPS provides a very useful mechanism for this – the time thresholds. There are different thresholds, for different operations, that can be set in OpenSIPS core and modules. Whenever the execution takes longer than configured, OpenSIPS will report it to the log, together with additional useful information (such as operation details or script backtrace):

  • exec_msg_threshold (core) – the maximum number of microseconds the processing of a SIP msg is expected to last. This is very useful to identify the “slow” functions in your overall script;
  • exec_dns_threshold (core) – the maximum number of microseconds a DNS query is expected to last. This is very useful to identify what are the “slow” DNS queries in your routing (covering also SRV, NAPTR queries too);
  • tcp_threshold (core) – maximum number of microseconds sending a TCP request is expected to last. This is useful to identify the “slow” TCP connections (in terms of sending data). Note that this option will cover all TCP-based transports, like TCP plain, TLS, WS, WSS, BIN or HEP.
  • exec_query_threshold (db_mysql module) – maximum number of microseconds for running a mysql query. This is useful to identify the “slow” query you may have. Note that this option covers all the mysql queries you may have in OpenSIPS – from script level or internally triggered by modules. A similar option is also in the db_postgres module too.

An example for the output trigger by the exec_msg_threshold :

opensips[17835]: WARNING:core:log_expiry: threshold exceeded : msg processing took too long – 223788 us.Source : BYE sip:……….
opensips[17835]: WARNING:core:log_expiry: #1 is a module action : match_dialog – 220329us – line 1146
opensips[17835]: WARNING:core:log_expiry: #2 is a module action : t_relay – 3370us – line 1574
opensips[17835]: WARNING:core:log_expiry: #3 is a module action : unforce_rtp_proxy – 3297us – line 1625
opensips[17835]: WARNING:core:log_expiry: #4 is a core action : 78 – 24us – line 1188
opensips[17835]: WARNING:core:log_expiry: #5 is a module action : subst_uri – 8us – line 1201

 

Good luck with the troubleshooting  and be sure your OpenSIPS rocks !!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s