Clustering Engine Improvements in OpenSIPS 3.2

The ability to implement highly available, scalable distributed SIP services has long been an important objective for OpenSIPS. A big leap in this direction was made in version 2.4, with a major rework of the clusterer module and the introduction of many enhancements and new clustering features. Since then, each major release has brought further improvements and new capabilities to the clustering support. OpenSIPS 3.2 is no different in this regard, with a focus on securing and better managing an OpenSIPS cluster.

TLS support

OpenSIPS 3.2 introduces a TLS variant of the TCP-based binary protocol used for intra-cluster communication between OpenSIPS instances, in the form of a new proto_bins module. This makes it possible to build secured OpenSIPS clusters over the public Internet in situations where VPNs or private networks are not an option. Using TLS also provides a level of authentication for nodes that dynamically join an OpenSIPS cluster.

In terms of configuration, we simply need to load the new proto_bins module and change the scheme of the BIN URLs from “bin:” to “bins:”, both for the listening socket of each instance and for the URLs of the other nodes stored in the database.
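As a rough illustration of the URL change, the following Python sketch rewrites a list of node URLs from the plain scheme to the TLS one, for example when preparing an update of the provisioned node URLs. The helper name and the addresses are hypothetical and are not part of OpenSIPS; only the “bin:” and “bins:” schemes come from the module naming.

```python
def to_bins_url(url: str) -> str:
    """Return the TLS ("bins:") form of a plain BIN ("bin:") URL.

    URLs that already use "bins:" are returned unchanged; anything else
    is rejected so that a typo does not silently end up in the node table.
    """
    scheme, sep, rest = url.partition(":")
    if not sep or not rest:
        raise ValueError(f"not a BIN URL: {url!r}")
    if scheme == "bins":
        return url
    if scheme != "bin":
        raise ValueError(f"unexpected scheme in {url!r}")
    return "bins:" + rest


# Hypothetical addresses, used only for illustration.
node_urls = ["bin:10.0.0.1:5566", "bin:10.0.0.2:5566", "bins:10.0.0.3:5566"]
secured_urls = [to_bins_url(u) for u in node_urls]
print(secured_urls)
```

Rejecting unknown schemes instead of passing them through is a deliberate choice in this sketch: a node left on a plain URL by mistake would otherwise be easy to miss.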

At the time of writing, the proto_bins module works only on top of the TLS implementation provided by the new wolfssl module. Once the 3.2 work on supporting both the openssl and wolfssl implementations is done, this will no longer be a requirement, and the TLS domains and settings will be available through the tls_mgm module.

Cluster management

Another set of improvements for the clustering engine in OpenSIPS 3.2 aims to provide better control and management over the cluster topology and replication capabilities.

Restricted node joining

Previously, nodes were able to freely join an existing OpenSIPS cluster as long as the new instances were properly configured, with no way to restrict which nodes were accepted. To address this, when database provisioning is used, nodes are now accepted in the cluster only if they are defined in the database in advance.

Besides addressing security concerns, this stricter approach also helps when trying to kick a node out of the cluster. This was previously problematic, as a node intended to be removed could be rediscovered before all nodes had completed a database reload via MI.
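The acceptance rule can be modeled in a few lines of Python. This is only a conceptual sketch, not the clusterer module's code: the function name is hypothetical, and the behavior without database provisioning is an assumption, since only the database-provisioned case is described here.

```python
from typing import Set


def accepts_node(node_id: int, provisioned_ids: Set[int], db_provisioning: bool) -> bool:
    """Model of the 3.2 join rule: with database provisioning enabled,
    only nodes already defined in the database are accepted."""
    if not db_provisioning:
        # Assumption: without database provisioning, the restriction
        # described in this section does not apply.
        return True
    return node_id in provisioned_ids


# Hypothetical node ids, used only for illustration.
provisioned = {1, 2, 3}
before_removal = accepts_node(3, provisioned, db_provisioning=True)

# Once node 3 is deleted from the provisioned set, it can no longer
# rejoin through discovery, which is what makes kicking a node out reliable.
provisioned.discard(3)
after_removal = accepts_node(3, provisioned, db_provisioning=True)
print(before_removal, after_removal)
```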

Removing nodes at runtime

Prior to OpenSIPS 3.2, in a cluster configured to dynamically discover its topology, there was no convenient way to remove specific nodes from the topology at runtime (short of restarting the entire cluster). For this purpose, we can now use the new clusterer_remove_node MI function.
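As a minimal sketch, assuming the JSON-RPC 2.0 encoding used by the OpenSIPS 3.x Management Interface, a request for this function could be built as shown below. The parameter names cluster_id and node_id are assumptions and should be checked against the clusterer module documentation; the helper itself is hypothetical and only builds the request body, leaving the transport (HTTP, FIFO, datagram) aside.

```python
import json
from typing import Any, Dict


def mi_request(method: str, params: Dict[str, Any], request_id: int = 1) -> str:
    """Serialize an MI command as a JSON-RPC 2.0 request body."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    )


# Assumed parameter names; cluster 1 and node 3 are illustrative values.
payload = mi_request("clusterer_remove_node", {"cluster_id": 1, "node_id": 3})
print(payload)
```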

Disabling nodes and capabilities

While doing maintenance on a platform or debugging clustering issues, it is often desirable to temporarily disable a specific node in the cluster, or to turn off data replication, without shutting down the OpenSIPS instance or altering the provisioned topology.

To improve control over the status of a node, the clusterer_set_status MI function has been extended in OpenSIPS 3.2 to allow disabling or enabling all communication with a specific node. Previously, this function controlled only the behavior of the local instance in relation to all of its neighbors.

In addition, version 3.2 provides a new clusterer_set_cap_status MI function that allows disabling communication for a specific clustering capability (e.g. dialog replication, user location clustering).
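To make the maintenance scenario concrete, the sketch below lays out one possible order of MI calls around a maintenance window: replication for one capability is switched off first, then communication with the node, and both are restored in reverse order afterwards. The parameter names (cluster_id, node_id, capability, status), the 0/1 status values and the capability name are all assumptions to verify against the clusterer module documentation; the function only describes the calls and does not send them.

```python
from typing import Dict, List, Tuple

# One planned MI call: (function name, parameters).
MiCall = Tuple[str, Dict[str, object]]


def maintenance_plan(cluster_id: int, node_id: int, capability: str) -> List[MiCall]:
    """Return the MI calls to issue before and after a maintenance window."""
    cap = {"cluster_id": cluster_id, "capability": capability}
    node = {"cluster_id": cluster_id, "node_id": node_id}
    return [
        # Before maintenance: stop replication, then isolate the node.
        ("clusterer_set_cap_status", {**cap, "status": 0}),
        ("clusterer_set_status", {**node, "status": 0}),
        # After maintenance: restore in reverse order.
        ("clusterer_set_status", {**node, "status": 1}),
        ("clusterer_set_cap_status", {**cap, "status": 1}),
    ]


# Hypothetical cluster, node and capability values.
for name, params in maintenance_plan(1, 3, "dialog-dlg-repl"):
    print(name, params)
```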

Data sync per sharing tag

As the underlying clustering layer is responsible for providing the data synchronization mechanisms for higher-level capabilities, OpenSIPS 3.2 also brings some improvements in this department. Modules now have the means of syncing data associated with specific sharing tags, based on the tag names or states. This is useful in order to avoid pulling out-of-date information from other nodes (e.g. syncing from a donor node that is in backup state).

Dialog replication

At the moment, only the dialog replication functionality makes use of this improved mechanism. A specific sharing tag can now be provided to the dlg_cluster_sync MI function. Also, in OpenSIPS 3.2, the dialog module can automatically issue a sync request when any node in the cluster becomes reachable. Only the dialogs marked with sharing tags in backup state will be synced.
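The selection rule for the automatic sync can be modeled as a simple filter. This is a conceptual sketch rather than the dialog module's implementation: the Dialog type, the tag names and the reading of tag states as those seen by the node requesting the sync are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Dialog:
    call_id: str
    sharing_tag: Optional[str]  # None models a dialog without a sharing tag


def dialogs_to_sync(dialogs: List[Dialog], tag_states: Dict[str, str]) -> List[Dialog]:
    """Keep only the dialogs whose sharing tag is in backup state."""
    return [
        d
        for d in dialogs
        if d.sharing_tag is not None and tag_states.get(d.sharing_tag) == "backup"
    ]


# Hypothetical tags and dialogs.
tag_states = {"vip1": "backup", "vip2": "active"}
dialogs = [Dialog("a", "vip1"), Dialog("b", "vip2"), Dialog("c", None)]
selected = dialogs_to_sync(dialogs, tag_states)
print([d.call_id for d in selected])
```

Dialogs tagged with an active tag are skipped because, under this reading, the local node already holds the authoritative copy of them.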

With the new improvements in the clustering support, the upcoming 3.2 release will allow you to build more manageable and secure OpenSIPS clusters.
