
Running VoIP services in the cloud has become a common approach for providers seeking scalability and flexibility. A typical high-availability solution involves an active/standby pair of servers, where one server handles traffic while the other remains a warm backup, ready to take over at any moment.
Different cloud providers offer various mechanisms to implement this setup, ensuring that failover can occur automatically in case of failure or be triggered manually for maintenance purposes. A common method is using a floating IP that points to the active server, allowing seamless transitions by updating the IP mapping through API calls. Many cloud providers, including AWS, DigitalOcean, and Hetzner, support floating IPs natively. However, cloud environments that rely on virtualized network stacks, such as Microsoft Azure and Google Cloud Platform, do not provide this capability.
In this article, we’ll explore an alternative high-availability strategy for VoIP deployments in Microsoft Azure and how to achieve seamless failover without relying on floating IPs.
VRRP
To implement an active-standby high-availability setup, you need a mechanism to designate one node as active while continuously monitoring its health. In the event of a failure, a new node must be promoted to take over. This behavior is typically managed by tools such as Keepalived, Pacemaker, and Heartbeat. Keepalived in particular relies on VRRP (the Virtual Router Redundancy Protocol) to monitor node status and ensure that only a single server remains active at any given time.
Under normal circumstances, VRRP operates over multicast networks. However, most cloud environments prohibit multicast traffic to prevent unnecessary network flooding. To work around this limitation, VRRP must run in unicast mode, allowing nodes to communicate directly without relying on multicast. This is where the Keepalived daemon becomes particularly useful, as it supports unicast VRRP, enabling reliable high availability even in cloud environments with restricted network configurations.
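As a sketch of what a unicast VRRP setup looks like, here is a minimal keepalived.conf fragment for the active node. The interface name, IP addresses, password, and the notify script path are all placeholders for illustration; adjust them to your deployment (the standby node would use state BACKUP and a lower priority):

```
vrrp_instance VI_1 {
    state MASTER                # on the standby node: state BACKUP
    interface eth0              # interface name is deployment-specific
    virtual_router_id 51
    priority 150                # use a lower value (e.g. 100) on the standby
    advert_int 1
    unicast_src_ip 10.0.0.4     # this node's private IP (placeholder)
    unicast_peer {
        10.0.0.5                # the peer's private IP (placeholder)
    }
    authentication {
        auth_type PASS
        auth_pass s3cret        # placeholder
    }
    # Optional: a script invoked on state transitions, useful for
    # exposing the node's role to an external health check.
    notify /usr/local/bin/keepalived-notify.sh
}
```

Note that no virtual_ipaddress block is configured here: since Azure will not route a floating address anyway, keepalived is used purely to elect and track the active node.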
Azure Load Balancers
When a client tries to reach the platform, it typically uses a Fully Qualified Domain Name (FQDN), which ultimately resolves to an IP address—the platform’s entry point. In cloud environments that support floating IPs, the infrastructure automatically forwards traffic to the designated active node advertised through the API. However, since Azure does not support floating IPs, we need an alternative approach: leveraging Azure load balancers to manage traffic distribution and failover.

As shown in the picture, Azure Load Balancers act as entry-point IPs that distribute incoming traffic across multiple servers in an active-active manner. However, since our goal is to implement an active-backup setup, we need to modify this default behavior. To achieve this, let’s take a closer look at how Azure Load Balancers work and explore the necessary adjustments.
When configuring an Azure Load Balancer, you define a group of virtual machines (VMs) capable of handling incoming traffic. Each node is assigned an HTTP probe, which periodically checks its availability. This built-in health check mechanism is exactly what we’ll leverage: instead of probing a generic service, we’ll configure the HTTP endpoint to query a small HTTP server that monitors Keepalived. This server will determine whether the node is active—returning 200 OK if it is, or 503 Service Unavailable if it is not.
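As a rough sketch of such a health-check server, the snippet below assumes keepalived is configured with a notify script that writes its current state ("MASTER" or "BACKUP") to a file; the file path, port, and notify mechanism are illustrative assumptions, not keepalived defaults:

```python
# Minimal health-check endpoint for the Azure Load Balancer probe.
# Assumes a keepalived notify script writes the current VRRP state
# ("MASTER" or "BACKUP") to STATE_FILE -- path is a placeholder.
from http.server import BaseHTTPRequestHandler, HTTPServer

STATE_FILE = "/var/run/keepalived.state"  # written by the notify script

def probe_status(state: str) -> int:
    """Map the keepalived state to the HTTP code the probe expects."""
    return 200 if state.strip() == "MASTER" else 503

class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        try:
            with open(STATE_FILE) as f:
                code = probe_status(f.read())
        except OSError:
            # No state file yet: report the node as not active.
            code = 503
        self.send_response(code)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), ProbeHandler).serve_forever()
```

The load balancer's HTTP probe would then target this port, so only the node currently elected MASTER by keepalived answers 200 OK.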
Therefore, by using Keepalived to track the active node and exposing its state to the Azure Load Balancer's health probe, we ensure that traffic is always directed to a single node: the active, healthy one.
Limitations
This solution is theoretically sound, but in practice things aren't quite that simple. Probing cannot happen continuously: in Azure, the minimum probe interval is 5 seconds, and the minimum number of consecutive unhealthy responses required to mark a node down is 2. Failover can therefore take at least 10 seconds and up to 15 seconds, a delay that might be problematic for some applications, especially those requiring near-instantaneous failover.
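The timing bounds above follow directly from the probe parameters; a quick back-of-the-envelope check:

```python
# Azure Load Balancer probe minimums discussed above.
probe_interval_s = 5       # minimum probe interval
unhealthy_threshold = 2    # minimum consecutive failed probes

# Best case: the node fails just before a probe fires, so the
# failing probes happen as soon as possible.
best_case_s = probe_interval_s * unhealthy_threshold

# Worst case: the node fails just after a probe succeeded, adding
# almost one full interval before the first failed probe.
worst_case_s = probe_interval_s * (unhealthy_threshold + 1)

print(best_case_s, worst_case_s)  # 10 15
```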
Moreover, probes are not synchronized across nodes. As a result, Azure could probe a node just before it crashes and find it healthy, then probe another node and find it healthy as well (because a failover has already occurred). This creates a small window of roughly 10 seconds during which the platform could have two (or potentially more) nodes receiving traffic. To prevent issues, your service must be able to handle this at the application layer. Specifically, the backup node should be configured to return 503 Service Unavailable if traffic is inadvertently routed to it during this brief period.
Conclusions
In conclusion, it’s essential to evaluate the cloud provider early in the design stages of your project, as it can significantly influence the way you architect your product. Cloud environments like Azure, which lack certain features like floating IPs, require alternative strategies for high availability, such as leveraging load balancers and custom health checks. Understanding these nuances upfront will help ensure that your system remains resilient, reliable, and capable of meeting your uptime requirements.

Thanks to Denys Pozniak for the insights! After publishing this article, we realized that Microsoft Azure actually introduced floating IPs in November 2024. You can read more about it here: https://learn.microsoft.com/en-us/azure/load-balancer/load-balancer-floating-ip.
Nevertheless, this article remains relevant for other cloud providers, such as Google Cloud Platform (as of now), which currently does not offer floating IPs natively.