200 OK When will my SBC Be Out of Gas? Predicting Signaling Element Maximum Capacity


When will my Session Border Controller be "full"? The same question applies to any other signaling element in a VoIP network: how much more work can I put into this device before it's overloaded?

In this discussion, we'll focus on the Acme Packet OS-C Session Border Controller, such as the NN4250 or NN3820. The typical constraints people bump into are CPU usage and session licenses.

Predicting Processor Overload

For CPU utilization, the key question is to predict when your CPU will be at the peak. Above that peak, you'll start dropping calls or having other problems. So what's the right peak?

Acme Packet has stated that when CPU utilization is somewhere between 80% and 90%, the SD will postpone certain maintenance tasks, such as session state replication. If the CPU sustains a high load for long enough, the two SDs in a pair can lose synchronization. So we try to ensure that the peak CPU is no greater than 80%.

Simple Arithmetic For Predicting

So the question for capacity planning is: how much of my CURRENT type of workload can the SD support, and not exceed 80% CPU? Here's one rough prediction method that we've had success with:


maxAllowableCpu <- 0.8
currentInvitesPS <- (Server INVITE requests Recent) / (Recent Period duration in seconds)
currentSipCpu <- ("show process cpu all" tSipd Avg value) * 0.01
predictedInvitesPS <- maxAllowableCpu / currentSipCpu * currentInvitesPS

For example: we're allowing the CPU to grow to 80% utilization. Suppose that you're observing 1260 INVITEs over the past 90 seconds. That means the current rate is 1260 / 90 = 14 INVITEs per second. Suppose also that your tSipd CPU load is 50%. That would make your predictedInvitesPS = (0.8 / 0.5) * 14 = 22.4, or about 22 INVITEs per second.
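The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the rough model, not a tool from Acme Packet: the function and parameter names are made up for this example, and it assumes you have already read the recent INVITE count, the recent period length, and the tSipd average CPU percentage off the SD yourself.

```python
def predicted_invites_per_second(recent_invites, period_seconds,
                                 sip_cpu_percent, max_allowable_cpu=0.8):
    """Scale the current INVITE rate linearly up to the CPU ceiling.

    All names here are hypothetical; the inputs are values you would
    collect by hand from the SD's statistics.
    """
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    if sip_cpu_percent <= 0:
        raise ValueError("sip_cpu_percent must be positive")

    current_invites_ps = recent_invites / period_seconds
    current_sip_cpu = sip_cpu_percent / 100.0  # percent -> fraction
    return max_allowable_cpu / current_sip_cpu * current_invites_ps


# The worked example from the text: 1260 INVITEs in 90 seconds at 50% tSipd CPU.
rate = predicted_invites_per_second(1260, 90, 50)
print(f"{rate:.1f} INVITEs per second")  # prints "22.4 INVITEs per second"
```

Because the model is a straight linear scaling, the result is only as good as the sample you feed it. Taking the inputs during your busy hour should give a more representative ratio than an idle period.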

I'm assuming that the CPU is dominated by SIP traffic. You'd need a more complex model if you're doing a lot of other CPU work.

Better Than A Spreadsheet

What's great about this model is that it assumes linear growth of all the other activity related to your system. Service providers vary WILDLY in the amount of NOTIFY traffic they support, for features like Busy Lamp Field (BLF). If your BLF usage grows with your INVITEs per second, then the INVITEs per second can give you a sense of your existing headroom.

Another great feature is that this model accommodates all of your existing SIP Header Manipulation Rules (HMRs). One SP may have simple NAT_IP HMRs, while another is doing a total rewrite of large XML SIP messages traversing the SD. So SPs can vary a lot in CPU load. This model assumes that the same HMRs will be applied to your future growth just as they are used right now.

Adapting for Customer Count rather than Call Count

You could also model the maximum registered customers, by using the current number of registered users instead of INVITE volume.
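As a sketch of that variation, the same linear scaling can be applied to a registered-user count. The function name and the sample figures below (20,000 registered users at 40% CPU) are hypothetical, chosen only to show the arithmetic, and the sketch assumes CPU load grows in proportion to registered users.

```python
def predicted_max_registered_users(current_registered_users, sip_cpu_percent,
                                   max_allowable_cpu=0.8):
    """Estimate how many registered users fit under the CPU ceiling.

    Assumes CPU load scales linearly with the number of registered users.
    """
    if sip_cpu_percent <= 0:
        raise ValueError("sip_cpu_percent must be positive")
    current_sip_cpu = sip_cpu_percent / 100.0  # percent -> fraction
    return max_allowable_cpu / current_sip_cpu * current_registered_users


# Hypothetical sample: 20,000 registered users while tSipd averages 40% CPU.
estimate = predicted_max_registered_users(20000, 40)
print(f"{estimate:.0f} registered users")  # prints "40000 registered users"
```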

A Proven Model

This model has been proven out several times across several customers in the US and abroad. It's been very helpful for predicting growth requirements and getting more SDs on order. And, in some cases, it has helped service providers optimize their SBC CPU utilization and delay the purchase of another SBC.

Session Counts

Determining when you will run out of session licenses or capacity is typically more straightforward, because the resource is less flexible. If an SBC has been licensed for 1000 sessions, then it's easy to determine how many of those sessions are in use right now. You can use the MIB or command-line monitoring. If you need more licenses at peak, then buy more.
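The license check is simple subtraction, but here is a small sketch for completeness. The function name is made up, and the peak figure of 850 sessions is a hypothetical sample. In practice you would supply the peak in-use count you observed through the MIB or the command line.

```python
def session_headroom(licensed_sessions, peak_sessions_in_use):
    """Return (remaining sessions, fraction of licenses used at peak)."""
    if licensed_sessions <= 0:
        raise ValueError("licensed_sessions must be positive")
    remaining = licensed_sessions - peak_sessions_in_use
    utilization = peak_sessions_in_use / licensed_sessions
    return remaining, utilization


# 1000 licensed sessions (as in the text) with a hypothetical peak of 850 in use.
remaining, utilization = session_headroom(1000, 850)
print(f"{remaining} sessions free, {utilization:.0%} of licenses used at peak")
```

A negative remaining value would mean the observed peak already exceeds what the license allows.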