No Downtime for OpenSIPS 3.0 restarts

Doing maintenance on live production servers is never desirable, but it is necessary in order to enhance your platform with new features, or just to apply some small, quick fixes. Maintenance usually requires restarting OpenSIPS, so that it can re-parse its configuration script.

Depending on your setup, OpenSIPS may need a large amount of data for routing purposes (for example, the dynamic routing rules). Such data is stored in a database and cached during OpenSIPS startup. This raises a problem with restarts, as loading and caching this data takes OpenSIPS a long time. Therefore, even though OpenSIPS itself comes back up almost instantly after a restart, in practice it cannot handle any new calls until all the data is re-cached in memory. This obviously translates into service downtime.

Restart Persistent Cache

This is the reason why, in OpenSIPS 3.0, we have developed a new mechanism that improves restart efficiency by persisting OpenSIPS memory across restarts. This new mechanism works as a restart cache: OpenSIPS internal memory is no longer released before shutting down, and after OpenSIPS starts up, the database is no longer queried to load the data, since everything is already there. Therefore, restarts can now happen with almost no downtime!

One of the most expensive modules in terms of startup delay is the Dynamic Routing module, due to its huge number of prefix-based rules. Since this is usually the most painful functionality, we first targeted the drouting module to test our feature and prove its efficiency.

Dynamic Routing Benchmark

To test this feature, we took a few time and memory measurements with and without the new persistent memory feature. For our tests we used the following drouting data set:

  • 40 carriers
  • 120 gateways
  • 5M rules
  • 20 groups

All the following tests were done on commodity hardware, on an Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz, with 16 GB of RAM and an SSD drive. OpenSIPS was configured to use 4 GB of shared memory, and 4 GB of restart persistent/cache memory.

Loading data from database, no cache

To load the data set in memory and index it, OpenSIPS consumed:

  • Time: 65 seconds
  • Shared memory usage: 1.4 GB
  • Max shared memory usage: 1.4 GB

After reloading the data using the MI command, the following measurements were observed:

  • Time: 60 seconds
  • Shared memory usage: 1.9 GB
  • Max shared memory usage: 2.8 GB

As expected, the maximum shared memory usage is twice the amount initially used (1.4 GB), since the reload brings in the same data set again. This can of course fluctuate if data is added to or removed from the database.
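One way to read these numbers is as a sizing rule for the memory pool. The sketch below is only a back-of-the-envelope helper, not anything OpenSIPS provides: it assumes (this is not stated above) that a reload keeps the old data set in memory until the new one is fully loaded, so the peak is the sum of the two. The function names are hypothetical.

```python
def peak_during_reload(current_gb, incoming_gb):
    """Estimated peak usage during a reload.

    Assumption: the old data set is released only after the new one
    has been completely loaded, so both coexist for a while.
    """
    return current_gb + incoming_gb


def pool_is_large_enough(pool_gb, current_gb, incoming_gb):
    """True if the configured pool can hold the estimated reload peak."""
    return peak_during_reload(current_gb, incoming_gb) <= pool_gb


if __name__ == "__main__":
    # Figures from the benchmark: 1.4 GB loaded, same data set reloaded,
    # 4 GB of shared memory configured.
    print(round(peak_during_reload(1.4, 1.4), 1))
    print(pool_is_large_enough(4.0, 1.4, 1.4))
```

With the benchmark figures, the estimate matches the measured 2.8 GB peak and fits in the 4 GB pool used for the tests.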

Loading data from database, with cache

Restart persistence means that, instead of loading the data from the database into memory on every start, we load it only once and then store it in a local file that is mapped into OpenSIPS memory. The values measured after enabling the restart persistence cache are:

  • Time: 64 seconds
  • Restart persistence memory usage: 1.4 GB
  • Max restart persistence memory usage: 1.4 GB

After reloading the data, the following measurements were observed:

  • Time: 62 seconds
  • Restart persistence memory usage: 1.9 GB
  • Max restart persistence memory usage: 2.8 GB

We can easily see that adding cache persistence does not add any penalty, either in loading time or in memory usage.

Loading data from cache

Once the data has been loaded from the database into the OpenSIPS cache, OpenSIPS uses the cache directly after a restart. Loading the cache file into memory is practically instantaneous, so the data is immediately available. Memory usage is similar to the other tests, since nothing extra is allocated.
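The general idea of a file-backed memory mapping that survives a restart can be illustrated with a small Python sketch. This is not how OpenSIPS implements the feature internally; it only shows the pattern. The cache layout, the `MAGIC` marker, the file name and `load_rules_from_db()` are all hypothetical, and unlike this toy version, a real implementation would reuse its in-memory structures in place rather than decode them again.

```python
import mmap
import os
import struct
import tempfile

CACHE_SIZE = 4096  # bytes; hypothetical and tiny compared to a real cache
MAGIC = b"DRC1"    # hypothetical marker meaning "cache already populated"


def load_rules_from_db():
    """Stand-in for the slow database query (made-up data)."""
    return [(4021, 1), (4031, 2), (4420, 3)]  # (prefix, gateway id)


def open_cache(path):
    """Map the cache file into memory, creating it if it is missing."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        if os.fstat(fd).st_size < CACHE_SIZE:
            os.ftruncate(fd, CACHE_SIZE)  # new bytes are zero-filled
        return mmap.mmap(fd, CACHE_SIZE)
    finally:
        os.close(fd)  # the mapping keeps its own handle


def get_rules(path):
    """Return (rules, source), where source is 'cache' or 'database'."""
    mem = open_cache(path)
    try:
        if mem[:4] == MAGIC:
            # Cache left behind by a previous run: no database access.
            (count,) = struct.unpack_from("<I", mem, 4)
            rules = [struct.unpack_from("<II", mem, 8 + 8 * i)
                     for i in range(count)]
            return rules, "cache"
        # First start: pay the loading cost once and persist the result.
        rules = load_rules_from_db()
        struct.pack_into("<I", mem, 4, len(rules))
        for i, rule in enumerate(rules):
            struct.pack_into("<II", mem, 8 + 8 * i, *rule)
        mem[:4] = MAGIC  # written last, so a half-filled cache is never trusted
        mem.flush()
        return rules, "database"
    finally:
        mem.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "drouting.cache")
    print(get_rules(path)[1])  # first start: database
    print(get_rules(path)[1])  # after a "restart": cache
```

The second call stands in for a restarted process: it finds the marker in the mapped file and skips the slow load entirely, which is the effect described above.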

Dynamic Routing Restart Persistence Configuration

The simplest configuration for this feature is to enable the restart persistency feature in the drouting module, using the enable_restart_persistency module parameter:

modparam("drouting", "enable_restart_persistency", yes)

To make this work properly, you might also need to tune the size of the restart persistent memory. You can do that with the restart_persistency_size parameter:

restart_persistency_size = 4096

Note that if you do not set this parameter, it will inherit the size of the shared memory used.

And that’s it, you’ve got zero reload time for your OpenSIPS that uses millions of dynamic routing rules!

Conclusion

In this article we showed that restart persistent storage is a very powerful mechanism for preventing the OpenSIPS downtime caused by restarts, and that it comes with no other penalties. We demonstrated this by benchmarking one of the most troublesome modules, the dynamic routing module, and managed to get it started with 5M records in no time!

Join us at the Amsterdam 2019 OpenSIPS Summit and Training to find out more interesting features about OpenSIPS 3.0!
