The CODEC Translator: VoIP network management


Wednesday, April 11, 2012

Solutions for SIP NAT/Firewall Traversal for VoIP

SIP-based communication does not reach users on the local area network (LAN) behind firewalls and Network Address Translation (NAT) routers automatically. Firewalls are designed to prevent inbound unknown communications and NAT stops users on a LAN from being addressed. Firewalls are almost always combined with NAT and typically still do not support the SIP protocol properly.

The issue of SIP traffic not traversing enterprise firewalls or NAT is critical to any VoIP implementation. At some point, all firewalls will need to be SIP capable in order to support the wide-scale deployment of enterprise person-to-person communications. However, in the short term, several solutions have been proposed to work around the firewall/NAT traversal problem. The bad news is that many of these solutions have serious security implications. The good news: there are other solutions that allow you to remain in control to various degrees. It is important to consider to what level you are prepared to surrender the control of your corporate or carrier infrastructure when choosing a NAT/firewall traversal solution in your network.
As you envision your solution, you should consider these questions:

Who should be in control of my security infrastructure: the firewall administrator, the user or a peering service provider?
Do we want a solution that is predictable and functions reliably with SIP standard-compliant equipment, or is a best-effort solution sufficient, one that works in certain scenarios and maybe only with specific operators?

Universal Plug-and-Play (UPnP) – The SIP client or Windows is in control

Universal Plug-and-Play (UPnP) for NAT control allows Microsoft Windows or a UPnP-capable SIP client to take control of the firewall. Both the client and firewall must support UPnP. This is a viable alternative only for those who can be sure there will never be anything malevolent on the LAN. Only a few firewalls and SIP clients support UPnP. Because of the inherently high security risk of allowing third-party software to take control of the firewall, this method is rarely used, and in practice only by home users.

STUN, TURN, ICE – The SIP client is in control

These are all protocols proposed by the IETF for solving the firewall/NAT traversal issue with intelligence in the clients together with external servers. With these methods, pinholes are created in the NAT/firewall for SIP signaling and media to pass through. It is also the responsibility of the SIP client to emulate what the protocol should have looked like outside the firewall. These methods assume certain behavior from the NAT/firewall and cannot work in all scenarios. In addition, they remove control from the firewall, which must be sufficiently open to allow the users to create the necessary pinholes.
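To make the client-side mechanism more concrete, here is a minimal sketch of the STUN message layout from RFC 5389: a Binding Request that a client would send to an external server, and a parser for the XOR-MAPPED-ADDRESS attribute value that reports the client's public address and port. The function names are illustrative, nothing is sent on the network, and a real client would also need retransmission, TURN and ICE logic.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442   # fixed value defined by RFC 5389
BINDING_REQUEST = 0x0001    # STUN message type for a Binding Request


def build_binding_request(transaction_id=None):
    """Return a 20-byte STUN Binding Request with no attributes."""
    if transaction_id is None:
        transaction_id = os.urandom(12)
    # Header: message type, message length (0 = no attributes), magic cookie.
    header = struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE)
    return header + transaction_id


def parse_xor_mapped_address(attr_value):
    """Decode the value of an IPv4 XOR-MAPPED-ADDRESS attribute."""
    _reserved, family, x_port, x_addr = struct.unpack("!BBHI", attr_value[:8])
    if family != 0x01:
        raise ValueError("only IPv4 is handled in this sketch")
    # Port is XORed with the top 16 bits of the cookie, address with all 32.
    port = x_port ^ (MAGIC_COOKIE >> 16)
    addr = socket.inet_ntoa(struct.pack("!I", x_addr ^ MAGIC_COOKIE))
    return addr, port
```

The address and port returned by the server are what the client would then advertise in its SIP signaling, which is the "emulation" step described above.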

Session Border Controllers at Service Provider – The service provider is in control

Most service providers use some sort of session border controller (SBC) in their core network to perform a number of tasks related to their SIP services. One of these tasks is to make sure that the SIP services can be delivered to their customers. They may use STUN, TURN and ICE for this by acting as a server component for these protocols. However, not all clients support these protocols, so the SBC may also use far-end NAT traversal (FENT) technology. The FENT function aids remote SIP clients by rewriting all relevant information in their SIP messages and relaying media, as well as keeping the client on the NATed network reachable. This solution only works with firewalls that are open from the inside, and may not work with all equipment or in all call scenarios. FENT is best suited for road warriors working at a hotel or at a conference, rather than at a fixed location where more reliable and secure solutions are available. FENT also removes control from the firewall, which must be sufficiently open to allow FENT from the service provider SBC to work.

SIP-capable Firewalls or enterprise SBC – The firewall administrator is in control

This is a long-term solution where the problem is solved where it occurs, at the firewall or in tandem with an existing firewall using an enterprise session border controller. When deployed at the enterprise edge the SBC offers the same security and control as it does for the service provider's core network. The enterprise SBC typically has a built-in SIP proxy and/or back-to-back user agent (B2BUA) functionality to give unparalleled flexibility in real-life enterprise deployments. Most vendors of SBCs for service providers have products that can be deployed at the enterprise and then there are other companies that have developed products for the enterprise market from the very beginning.

For an enterprise, there are special security and functional requirements that make the SIP-capable firewall or enterprise SBC the solution of choice. First, it is the only solution that allows the firewall to maintain control of what is traversed between the LAN and the outside world. In addition, it is becoming more and more common to have a SIP server on the LAN. In fact, all SIP-based IP PBXs are SIP servers. In order for these SIP servers to communicate over IP with the outside world, the firewall simply must be SIP-enabled. Because many IP PBXs implement their own SIP extensions outside the SIP standard, it is very important that the firewall or enterprise SBC be adapted to support these extensions.

Due to the complexity of real-life installations, it is highly recommended that you deploy specialized SIP-enabling devices such as SIP-proxy firewalls even in the SOHO and SMB markets, even if it might not be motivated from a security policy perspective.

With the explosion of these various solutions and protocols, comes another potential headache. What do I do if something goes wrong? How do I find the offending service or equipment if jitter occurs or my users start experiencing one way audio? What is causing echo on the conversation? Before considering a large scale enterprise or service provider deployment, you should also consider a robust troubleshooting package. You might determine that additional test equipment is needed, but you will likely find that hosted troubleshooting packages are much more cost effective and future-proof.

Wednesday, April 4, 2012

Packet Voice Quality Problems and Answers

Almost all packet-based voice quality issues are attributable to some type of degradation on the packet network that the voice traffic's RTP stream traverses. Voice traffic brings to light network problems that might otherwise go unnoticed when just carrying normal data traffic. This is because voice compressor-decompressors (CODECs) require that packet loss and variable delay in the IP telephony network be minimized. Let's explore some common network issues that result in poor voice quality and what you can do about them:

Packet Drops

Packet-based telephony demands that speech packets find their destination within a predictable amount of time. There is very little tolerance for them to be dropped somewhere along the way from the source to the destination. In a properly designed network with Quality of Service (QoS) provisioning in place, packet loss should be nearly zero. All voice CODECs can tolerate some degree of packet loss without adversely affecting voice quality. Upon detecting a missing packet, the CODEC decoder on the receiving device makes a best guess as to what the waveform during the missing period of time should have been. Most CODECs can tolerate up to five percent random packet loss without noticeable voice quality degradation. This assumes that the five percent of packets being lost are not being lost at the same time, but rather are randomly dropped in groups of one or two packets. However, losing several packets in a single burst, even as a low percentage of total packets, can cause noticeable voice quality problems.

Note: You should design your network for zero packet loss for packets that are tagged as voice packets. A converged voice/data network should be engineered to ensure that only a specific number of calls are allowed over a limited-bandwidth link. You should guarantee the bandwidth for those calls by giving priority treatment to voice traffic over all other traffic.

There are various tools and procedures that you can use to determine whether you are experiencing packet loss in your network and where in the network the packets are getting dropped.

1. Examine IP phone statistics:

If you are troubleshooting at the phone experiencing the problem, access the phone statistics by pressing the help (i or ?) button on the IP phone twice in quick succession during an active call.
If you are working with a remote user, open a web browser on your computer and enter the IP address of the user's phone. During an active call, choose the Streaming Statistics > Stream 1 options from the display.

2. Examine the counters RxDisc and RxLost shown on the IP phone (or Rcvr Lost Packets if you are viewing the statistics remotely using a web browser).

RxLost measures the number of packets that were never received because they were dropped in the network somewhere. By detecting a missing RTP sequence number, the IP phone can determine that a packet has been lost.
RxDisc corresponds to packets that were received but were discarded because they could not be used at the time they arrived. RxDisc can come from an out-of-order packet or a packet that arrived too late.

3. If either of these two counters increments, you should investigate to learn why packets are being lost or discarded.
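The way a receiver can derive such counters from RTP sequence numbers can be sketched as follows. This is an illustration only, not the phone's actual firmware logic: the function name is hypothetical, packets that arrive behind a later one are simply counted as out of order (roughly what a discard counter would see), and the 16-bit sequence number wraparound is handled explicitly.

```python
def count_lost_and_out_of_order(seq_numbers):
    """Return (lost, out_of_order) for RTP sequence numbers in arrival order."""
    if not seq_numbers:
        return 0, 0
    base = seq_numbers[0]
    highest = 0          # highest offset from the first packet seen so far
    seen = {0}           # offsets of all packets received
    out_of_order = 0
    for seq in seq_numbers[1:]:
        # Distance from the highest packet so far, as a signed 16-bit value.
        delta = (seq - (base + highest)) & 0xFFFF
        if delta >= 0x8000:
            delta -= 0x10000
        offset = highest + delta
        if delta > 0:
            highest = offset
        else:
            out_of_order += 1    # arrived after a later packet, or a duplicate
        if offset >= 0:
            seen.add(offset)
    expected = highest + 1
    return expected - len(seen), out_of_order
```

For example, the arrival sequence 65534, 65535, 0, 1, 3 spans a counter wrap and is missing exactly one packet (sequence number 2).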

Regardless of how low your packet loss is, if it is not zero, you should investigate the root cause. It just might be a sign of a bigger problem that will get worse with higher call volume. Always remember that packet loss can be occurring at any layer of the OSI-like transmission model, so be sure to check for all layers possible in each hop. For example, if there is a Frame Relay connection over a T1 between two sites, you should:

Make certain that there are no errors at the physical layer on the T1.
Determine if you are exceeding your committed information rate (CIR) on the Frame Relay connection.
Verify that you are not dropping the packets at the IP layer because you are exceeding your buffer sizes.
Check whether your QoS is improperly configured.
Ensure that your service provider not only guarantees packet delivery but also guarantees a low-jitter link. Some service providers may tell you that they do not provide a CIR but guarantee that they will not drop any packets. In a voice environment, delay is as important as packet loss. Many service providers' switches can buffer a large amount of data, thereby causing a large amount of jitter.

Another common cause of drops in an Ethernet environment is a duplex mismatch, when one side of a connection is set to full duplex and the other side is set to half duplex. To determine if this is the case, perform the following steps:

Check all the switch ports through which a given call must travel and ensure that there are no alignment or frame check sequence (FCS) errors. Poor cabling or connectors can also contribute to such errors; however, duplex mismatches are a far more common cause of this kind of problem.
Examine each link between the two endpoints that are experiencing packet loss and verify that the speed and duplex settings match on either side.

Although duplex mismatches are responsible for a large number of packet loss problems, there are many other opportunities for packet loss in other places in the network as well. When voice traffic must traverse a WAN, there are several places to look. First, check each interface between the two endpoints, and look for packet loss. If you are seeing dropped packets on any interface, there is a good chance that you are oversubscribing the link. This could also be indicative of some other traffic that you are not expecting on your network. The best solution in this case is to take a trace to examine which traffic is congesting the link.

Hosted/Web-based analysis tools are invaluable in troubleshooting voice quality problems. With these tools, you can examine each packet in an RTP stream to see if packets are really being lost and where in the network they are being lost. With such a tool, perform the following steps:

Start at the endpoint that is experiencing the poor-quality audio where you suspect packet loss.
Take a trace of a poor-quality call and filter it so that it shows you only packets from the far end to the endpoint that is hearing the problem. The packets should be equally spaced, and the sequence numbers should be consecutive with no gaps.
If you are seeing all the packets in the trace, continue taking traces after each hop until you get a trace where packets are missing.
When you have isolated the point in the network where the packet loss is occurring, look for any counters on that device that might indicate where the packets are being lost.
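The "equally spaced" check in the steps above can be sketched in a few lines. The function name is hypothetical, and the 20 ms packet interval and 10 ms tolerance are assumptions for illustration; the real interval depends on the CODEC and packetization settings in use.

```python
def find_spacing_problems(arrival_times_ms, expected_interval_ms=20.0, tolerance_ms=10.0):
    """Return (index, gap_ms) for each packet whose gap to its predecessor is off."""
    problems = []
    for i in range(1, len(arrival_times_ms)):
        gap = arrival_times_ms[i] - arrival_times_ms[i - 1]
        # Flag gaps that are much longer or much shorter than expected.
        if abs(gap - expected_interval_ms) > tolerance_ms:
            problems.append((i, gap))
    return problems
```

A long gap followed by a very short one usually means a packet was held in a queue and then released together with the one behind it, so the list of flagged indexes is a starting point for deciding which hop to trace next.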

Queuing Problems

Queuing delay can be a significant contributor to variable delay (jitter). When you have too much jitter end-to-end, you encounter voice quality problems. A voice sample that is delayed over the size of the receiving device's jitter buffer is no better than a packet that is dropped in the network because the delay still causes a noticeable break in the audio stream. In fact, high jitter is actually worse than a small amount of packet loss because most CODECs can compensate for small amounts of packet loss. The only way to compensate for high jitter is to make the jitter buffer larger, but as the jitter buffer gets larger, the voice stream is delayed longer in the jitter buffer. If the jitter buffer gets large enough such that the end-to-end delay is more than 200ms, the two parties on the call feel like the conversation is not interactive and start talking over each other.
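The trade-off between jitter buffer size and delay can be illustrated with a small sketch. It is a simplification under stated assumptions: a fixed (non-adaptive) jitter buffer whose playout deadline is measured from the fastest packet, hypothetical function names, and the 200 ms budget quoted above.

```python
def late_packets(transit_delays_ms, jitter_buffer_ms):
    """Count packets that arrive too late for playout with a fixed jitter buffer."""
    if not transit_delays_ms:
        return 0
    # Playout is scheduled relative to the fastest packet plus the buffer depth.
    deadline = min(transit_delays_ms) + jitter_buffer_ms
    return sum(1 for delay in transit_delays_ms if delay > deadline)


def exceeds_delay_budget(network_delay_ms, jitter_buffer_ms, other_delay_ms=0.0, budget_ms=200.0):
    """True if the total one-way delay is above the budget (200 ms by default)."""
    return network_delay_ms + jitter_buffer_ms + other_delay_ms > budget_ms
```

With transit delays of 40, 45, 42, 90, 41 and 120 ms, a 40 ms buffer discards two packets; a 100 ms buffer saves them, but on a path with 120 ms of network delay it pushes the total past 200 ms.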

Remember that every network device between the two endpoints involved in a call (switches, routers, firewalls, and so on) is a potential source of queuing or buffering delays. The ideal way to troubleshoot a problem in which the symptoms point to delayed or jittered packets is to use a hosted diagnostic tool at each network hop to see where the delay or jitter is being introduced.

Also remember that once you've made changes to your network elements and call paths that it is always appropriate to perform regression testing across your network with some type of automated voice quality test package.

Saturday, March 3, 2012

MOS Results are Good but Customers Still Complain of Poor Speech Quality

When monitoring the speech packets of a VoIP network, i.e. the RTP stream, protocol analyzers or VoIP service monitoring systems may analyze the packets and calculate a Mean Opinion Score (MOS), which describes the quality of the speech. If customers call into your call center or NOC and complain about speech quality, this is the first place you would go to analyze the problem. Finding the bad call the customer is referring to will take a few seconds with a good voice service assurance system. Sometimes when you find the call, the MOS may be 4.0 or greater, indicating good speech quality.

How can we explain this discrepancy?

The method used to calculate MOS by a monitoring system, which needs to monitor thousands of active calls and produce voice quality (QoS) measurements for all of them, is the R-factor based on the E-model (ITU-T G.107). The R-factor takes into account the codecs used by the endpoints (VoIP phones) in the call, but most of the calculation is derived from what happens to the packet stream as it transits an IP packet network. So the R-factor requires measurements of packet loss and jitter for the stream of VoIP (RTP) packets.
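For reference, the last step of that calculation, mapping an R-factor to an estimated MOS, follows the conversion formula given in ITU-T G.107. The sketch below shows only that mapping; computing R itself from codec, loss and jitter measurements is not shown, and the function name is illustrative.

```python
def r_factor_to_mos(r):
    """Map an E-model R-factor to an estimated MOS using the G.107 conversion."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6
```

An R-factor of about 80 corresponds to a MOS of roughly 4.0, the value mentioned above as indicating good speech quality.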

How are packet loss and jitter detected in an RTP stream? The RTP packets contain a sequence number and a timestamp, and from these packet loss and jitter can be calculated. It is, however, important to understand that these measurements can only be made on the received RTP streams, i.e. packet loss and jitter are only measured up to the point in the network where the monitoring system probe is connected.
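As an illustration of the timestamp-based part, the sketch below implements the interarrival jitter estimator defined in RFC 3550 (a running average with a gain of 1/16). The function name and the 8000 Hz clock rate are assumptions for the example (8000 Hz is the RTP clock of common narrowband codecs such as G.711), and RTP timestamp wraparound is ignored for brevity.

```python
def interarrival_jitter_ms(packets, clock_rate=8000):
    """Estimate jitter in milliseconds using the RFC 3550 running average.

    packets: iterable of (arrival_time_seconds, rtp_timestamp) in arrival order.
    """
    jitter = 0.0            # running estimate, in RTP timestamp units
    prev_transit = None
    for arrival_s, rtp_ts in packets:
        # Relative transit time: arrival expressed in timestamp units minus RTP timestamp.
        transit = arrival_s * clock_rate - rtp_ts
        if prev_transit is not None:
            d = abs(transit - prev_transit)
            jitter += (d - jitter) / 16.0
        prev_transit = transit
    return jitter / clock_rate * 1000.0
```

Because the estimate only uses packets that actually reached the measuring point, it reflects the network up to the probe and nothing beyond it.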

RTP streams leaving your network going to your customer’s premises will not be measured unless you have a probe on the customer’s premises or their VoIP phones support RTCP. A sophisticated voice service assurance system will also read the RTCP packets coming back from the customer’s side, so that leg of the call can be included in the R-factor measurement. Some monitoring systems will show you in which leg of the call the packet impairments are added.

If call quality is bad but MOS is good, perhaps you are not monitoring all legs of the call.

Another reason may account for this discrepancy. MOS or Perceptual Speech Quality does not include echo or overall end-to-end delay. So if the customer is experiencing echo, this will not degrade the MOS values. Similarly, if there is delay in the network (as opposed to jitter, which is short-term varying delay), this will not impact the MOS value. Network delay in VoIP causes cold or stilted conversation, so this could be a reason for your customer to call you.

If customers complain of poor speech quality but the MOS results are good, use your voice service assurance system to record the calls from this customer (with the necessary disclaimers and forewarning, of course) and listen to a sample for echo or delay.

Wednesday, January 25, 2012

The One-Way Audio VoIP Dilemma

One-way audio is an annoying anomaly where one person can hear the other caller, but that caller can't hear them. It is uncommon in non-VoIP telephony, but seems to occur frequently in systems that have just been deployed.

In traditional TDM voice transmissions, the circuits are reserved for two-way voice transmission, but in VoIP each voice stream is independent. Should one of the streams be lost, deferred, or misdirected, the resulting call has an active stream in only one direction.

The most common cause of one-way audio is improper negotiation of the RTP voice stream by a server that is behind an unaccommodating firewall.
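One quick check, offered here as an assumption about how this problem typically shows up rather than something stated above, is to look at the SDP body of the SIP messages seen outside the firewall: if the connection (`c=`) line still advertises a private address, the far end has nowhere reachable to send its RTP stream. A minimal sketch with a hypothetical function name:

```python
import ipaddress


def private_media_addresses(sdp_text):
    """Return connection addresses in an SDP body that are private addresses."""
    flagged = []
    for line in sdp_text.splitlines():
        line = line.strip()
        # Connection line format: c=<nettype> <addrtype> <address>
        if not line.startswith("c="):
            continue
        parts = line[2:].split()
        if len(parts) != 3:
            continue
        try:
            addr = ipaddress.ip_address(parts[2].split("/")[0])
        except ValueError:
            continue    # hostname or malformed address; ignored in this sketch
        if addr.is_private:
            flagged.append(str(addr))
    return flagged
```

An empty result does not prove the negotiation is correct; it only rules out this one symptom.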

In a future post we will talk about how to isolate the offending firewall and how to work with the system or firewall administrator to set NAT parameters appropriately so that both RTP streams can be negotiated successfully.

Friday, November 18, 2011

The True Customer Experience of a Voice Service

Customers perceive the quality of a voice service and the reliability of a voice service based on many different experiences. Such experiences may be good or bad, but very often customers notice only the bad experience – obviously as they buy a service that is meant to work 100% at any time and from anywhere.

Bad experiences are often caused by network related problems:

• Calls with degraded voice quality
• Interrupted or dropped calls
• Unsuccessful call attempts
• Missed calls that did not ring or calls that were not signaled

Also end device related problems bring negative user experience:

• Empty battery on smartphone
• End device & VoIP Endpoint crashes
• Inability to setup 3 party conference call

For all voice operators, over-the-top (OTT) and legacy networks alike, these end-device related troubles are very important, because in the end the customer does not care at all why something is not working. He will blame his voice service provider anyway.

As a result, when considering the management of customer experience in next-generation voice networks, it is very important to look beyond your own network infrastructure. Internet connections, utilization and network problems on the customers’ corporate LAN and end devices cause a large portion of the problems. So it would definitely not be a good idea to consider a service to be running well simply because all systems in one’s own network are up and running.

Instead operators also need to manage:

• End device & VoIP Endpoint firmware versions
• App/softphone software versions
• IP link quality characteristics
• Performance and problems at your Customers’ premises
• Error logs from the end device

Very often problems can be solved by optimizing configurations, codec settings, centralized firmware updates or proactively contacting the customer about upcoming problems with his device or software. So a tool is needed that sees the endpoint or VoIP device as well as the network. But for all of that, a customer experience management system must provide the full set of information about how the customer truly perceives the service end-to-end.

When looking at customer experience management for VoIP networks, operators should choose a holistic approach and carefully select software solutions that are designed to look far beyond the usual scope and take care about the most important stakeholder in this game – the customer.

Sunday, November 6, 2011

Visualizing a VOIP Security Attack

Visualizing a cyber attack on a VOIP server from Ben Reardon, Dataviz Australia on Vimeo.
Prevent a VoIP security attack on your systems!

Friday, October 21, 2011

Network Monitoring - It's Your Livelihood

What are your daily duties as the VoIP network operations director? You have to keep your network up and running. You have to answer calls which may relate to situations like “X location is down” or “Y location is slow”. May we suggest that you proactively monitor your network as described below and perform tasks like:

Monitor your network and take actions with respect to situations like device and line failures.
Analyze line/physical facility utilization, errors on the facility and be sure about network performance and conformance to SLAs.
Be aware of what "talks to what" and when? Be sure how much bandwidth is needed for every single application riding your network (and the networks you traverse.)
Know your exact data flows over your networks.

If you have all this information at your disposal, people will think twice before they point the finger at you.

But how can you achieve this?

You need a phased approach to understand network monitoring. I am not talking about network layers, but network monitoring layers. We have to look deeply into the monitoring layers before deciding on network monitoring software needs. A simple summary could include these:

Preconditions of network monitoring.

Up/Down monitoring

Performance Monitoring / SNMP monitoring

Who talks with whom? / Netflow monitoring

Data capture / Data sniffing

Preconditions of Network Monitoring
Network documentation is essential to monitor a network. Trying to set up network monitoring tools before going through the documentation is a complete waste of time. You will see everything green on the screen, but this may mask the fact that one of the redundant lines is down. You will sit staring without knowing what is happening. Always remember, documentation comes first and everything else follows.
Suggested network documentation tools: PowerPoint/Visio, NetViz

Up/Down monitoring
Design a map in which you can see some red and green lights glowing. Green means up and red means down. It is simple yet powerful. You will immediately know that there is a problem if a red light glows. This is based on ping. Almost every IP device supports echo/echo reply, so you can monitor all IP devices in your network by using ping. Go one step further by monitoring each application present on a device instead of the whole device. All network applications use TCP/UDP ports. You can monitor the applications by trying to connect to their ports, for example with telnet for TCP-based services. An open port suggests that the application is running.
Suggested monitoring tools: WhatsupGold, nmap
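The port-based check can be sketched with nothing more than the standard socket library. The function name is hypothetical, and the check only covers TCP: a UDP service cannot be verified by a simple connect, so it would need an application-level probe instead.

```python
import socket


def tcp_port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, timed out or unresolvable: treat all as "down".
        return False
```

Run on a schedule against a list of (host, port) pairs, the True/False results map directly onto the green and red lights described above.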

Performance monitoring / SNMP monitoring
The lines are up, the devices are up, but life is not perfect. People may complain about the performance of data lines, but are they saturated or do they have plenty of spare bandwidth? Is there packet loss on the lines? Are routers running out of memory? We need SNMP to monitor the heart beat of the network.
Suggested monitoring tools: MRTG, Solarwinds Orion, PRTG
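Whether a line is saturated is usually derived from two readings of an interface octet counter such as ifInOctets. The arithmetic is sketched below; the function name is hypothetical, the SNMP polling itself is not shown, and the sketch assumes the counter wraps at most once between readings.

```python
def link_utilization_percent(octets_start, octets_end, interval_s, link_bps, counter_bits=32):
    """Percent utilization from two readings of an SNMP octet counter."""
    modulus = 1 << counter_bits
    # Modular subtraction handles a single counter wrap between the readings.
    delta_octets = (octets_end - octets_start) % modulus
    bits_per_second = delta_octets * 8 / interval_s
    return 100.0 * bits_per_second / link_bps
```

For example, 28,950,000 octets received in 300 seconds on a 1.544 Mbps T1 works out to 50 percent utilization. On fast links a 32-bit counter can wrap more than once per polling interval, which is one reason to poll more often or use 64-bit counters where the device offers them.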

What is "talking" with what? / Netflow monitoring
You may realize that the line is full, but is someone or some application increasing the traffic load enormously? Who are they? Is it necessary traffic? On some devices, the “ip accounting” command gives you an idea of current traffic sources and destinations. Nevertheless, to analyze and optimize the traffic we need flow monitoring. We need to know source and destination IP addresses, TCP/UDP ports and the number of packets/bytes.

Everyone blames the network speed until you publish a network usage report that clearly shows only 15% of the traffic is ERP traffic and the rest comes from Internet access. You should know that flow monitoring tools require more server resources, since they collect an enormous amount of data.
Suggested monitoring tools: Fluke NetFlow monitor, Paessler
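A usage report of that kind is essentially an aggregation over flow records. The sketch below assumes the records have already been exported and parsed into dictionaries; the key names, the function name and the idea of identifying an application by destination port are all assumptions for illustration.

```python
from collections import Counter


def traffic_share_by_port(flows):
    """Return {dst_port: percent_of_total_bytes}, largest share first.

    Each flow is a dict with hypothetical keys: src, dst, dst_port, bytes.
    """
    totals = Counter()
    for flow in flows:
        totals[flow["dst_port"]] += flow["bytes"]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {port: 100.0 * count / grand_total for port, count in totals.most_common()}
```

The same loop keyed on source address instead of destination port answers the "who are they?" question.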

Data capture / RMON – Sniffer tools
Sometimes you need to observe the exact data flow on the line and not just information about it. Consider this sample scenario. After you find out that a web service causes inappropriately high network traffic, the owner of the application may simply say “No, we are not pushing this much data to the network. We just respond Yes or No in this web service and it is just 100 bytes”. Therefore, you should sniff the data flow on the line. Maybe you will find that the web service responds with yes or no (100 bytes) plus the definition of the web service (6 kilobytes).
Suggested monitoring tools: Wireshark, Palladion

You can have a look at Network Monitoring Tools in Stanford University web site for a superb list of network monitoring tools. You can find another tidy list at Network Traffic Monitoring in Alan Kennington’s topology.org.

Thursday, October 13, 2011

11 Million Euro Loss - 1 Million in Profits - VoIP Fraud

Recent Voice over IP fraud attackers made over 1 million Euro in profits.
It seems that they simply were scanning for PBX servers with phone extensions that have weak passwords. Then they abused these accounts to make phone calls for "free", except that free has the price of 11 million EUR for the service provider victims!

Apparently, they originally used these accounts for their own personal phone calls. However, they got greedy and between October 2009 and February 2010 they made 23,500 calls / 315,000 minutes to premium numbers. Then (from what I understood), they got even more greedy and used a Shadow Communication Company and prices for premium numbers that then linked to another site, further obfuscating their "business." Using this scheme they recruited other people to make 1,541,187 fraudulent calls or 11,094,167 minutes of talk time.

They used other premium numbers affiliate networks at first and then, when they realized the potential, they set up a company in the UK - Shadow Communications Inc. - through which they were able to sign a contract on their own with a premium rate number provider and offer their own affiliates service, basically taking their "business" to a whole new level.

One of the original articles on this can be found here.

Sunday, October 9, 2011

Storming SIP - VoIP Systems Are The New Target

You Will Be Attacked - Reduce Exposure

When implementing a VoIP infrastructure or any kind of network technology, it is best to reduce the exposure to attack. The fact that the VoIP infrastructure is typically sitting next to other network entities makes the SIP network elements reachable and possibly vulnerable to an attack coming from the other network servers. The number of VoIP phones and PBXs on the Internet is constantly growing, and if the infrastructure does not require exposure to the Internet, then avoid it. To help you separate the VoIP network from the rest, various network switch vendors allow you to set up a VLAN specifically for VoIP. However, be aware that VLANs are not a panacea, and tools like VoIPhopper make it easy to demonstrate the fact that VLAN is not enough. Cisco published a white paper called VLAN Security, where they describe how to protect against a number of attacks aimed at VLAN technology. Segregating the VoIP network can also be done through the use of firewalls or physical separation. VPN tunneling has also been previously suggested because it provides both encryption and can also be used to separate the VoIP traffic from the normal traffic.

However, these solutions might not always be feasible – especially since one major advantage of VoIP is that it integrates with other network elements on the Internet. In fact, various VoIP vendors market the fact that you can use your existing network infrastructure without having to lay new cables. Whether or not this is a good idea depends on a large number of factors. When designing a VoIP infrastructure, it is therefore important to understand the requirements and mitigate depending on the case. For example, a hotel VoIP network will have different requirements than a corporate IP phone network, and therefore a systems designer can apply different security precautions during the planning stage. Some other suggestions and observations:

• It is of course good to make use of encryption mechanisms such as TLS and SRTP. Unfortunately, the encryption for SIP and RTP is not yet widely supported. Zfone by the creator of PGP is particularly interesting. We shall not be going through this subject in depth since it is not within the scope of the attacks described within this article, but it definitely deserves a mention.

• The importance of good passwords for IP Phones should not be underestimated. If the system does not require that end users set their own password, then do not allow this functionality. Instead, make use of some kind of password management and set their password to one that is unique and hard to guess. Applications such as KeePass, which is open-source and free, can generate strong random passwords for you, as well as manage such passwords in a relatively secure manner.

• OpenSER, which is an open-source SIP server, has a module named pike. This module is able to block requests that exceed a given limit. This can allow for blocking of both extension guessing and password cracking. However, one has to be cautious with such solutions. Attackers can make use of IP spoofing to intentionally block legitimate traffic. It might also unintentionally block legitimate traffic if it is not properly configured.

• SIP allows extension lines which do not require authentication. If there is no justification for unauthenticated extensions, then make sure NOT to use this feature.

• Hardphones will get security fixes in the form of a firmware update, while softphones will get a new software release. Keeping up to date with the latest versions can be a pain, but it is certainly one way of making sure your system does not fall victim to attackers exploiting a security vulnerability in your SIP phone.
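For the password suggestion in the list above, a provisioning script can generate unique, hard-to-guess extension passwords itself. This is a minimal sketch using Python's standard `secrets` module; the function name, the default length and the letters-and-digits alphabet are assumptions, chosen because some phones do not accept special characters.

```python
import secrets
import string


def generate_sip_password(length=20):
    """Return a random password drawn from ASCII letters and digits."""
    alphabet = string.ascii_letters + string.digits
    # secrets.choice uses the operating system's cryptographic random source.
    return "".join(secrets.choice(alphabet) for _ in range(length))
```

Generating one password per extension, and storing it in a password manager rather than deriving it from the extension number, addresses the weak-password scanning described in the fraud post above.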

You Are Under Attack

Detection is a very important step in a security solution. A network IDS such as Snort, when placed at the right location, can be of great help when trying to detect that an attack is underway. Snocer, which describes itself as providing Low Cost Tools for Secure and Highly Available VoIP Communication Services, has previously published some Snort rules for public consumption. These rules are also available in the latest Snort community rules. In this section we will describe some of them and explain how they can be effective in catching the attacks mentioned previously. We will also provide some new Snort rules which can also detect activity described in this article and not caught by the current Snort community rules.

The Snort rules by Snocer are quite easy to understand, and are able to provide generic detection. Each of the rules looks out for an excessive number of SIP messages coming from a single IP address over a short period of time. The different SIP messages are INVITE and REGISTER requests, and 401 Unauthorized messages.
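The idea behind these rules, counting messages per source address inside a short time window, can be sketched in a few lines of Python. This is not how Snort implements thresholds; the class name, the method name and the limits used are hypothetical and only illustrate the logic.

```python
from collections import defaultdict, deque


class RateThreshold:
    """Flag a source that sends more than `count` messages within `seconds`."""

    def __init__(self, count: int, seconds: float):
        self.count = count
        self.seconds = seconds
        self._seen = defaultdict(deque)  # source IP -> timestamps still in window

    def observe(self, src_ip: str, timestamp: float) -> bool:
        """Record one message; return True when the source exceeds the limit."""
        window = self._seen[src_ip]
        window.append(timestamp)
        # Discard timestamps that have fallen out of the time window.
        while window and timestamp - window[0] > self.seconds:
            window.popleft()
        return len(window) > self.count
```

One detector instance would be kept per message type (INVITE, REGISTER, 401 Unauthorized), each with limits tuned to the normal traffic of the network being watched.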

The INVITE and REGISTER flood rules catch svwar and svcrack being run with default options against a SIP proxy. To catch a default svmap scan, we need to look out for SIP messages carrying an OPTIONS request, spread across different hosts in a short time. Listing 20 shows one such rule, which triggers an alert if the rule is matched 30 times in 3 seconds. One should probably adjust this rule depending on the address space being watched by Snort. If Snort is watching a /29 mask, i.e. only 6 hosts, then one should change the count to 6 and the number of seconds to 1 or less. On larger networks, increase the count to decrease the chance of a false positive.
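The /29 arithmetic above can be checked with Python's standard `ipaddress` module. The helper names and the rule of capping the count at the number of usable hosts are assumptions for this sketch; the address range used in the example is a documentation prefix, not a real deployment.

```python
import ipaddress


def usable_hosts(cidr: str) -> int:
    """Count the assignable host addresses in an IPv4 network."""
    net = ipaddress.IPv4Network(cidr)
    # On prefixes up to /30 the network and broadcast addresses are not hosts.
    return net.num_addresses - 2 if net.prefixlen <= 30 else net.num_addresses


def suggested_scan_count(cidr: str, default: int = 30) -> int:
    """Cap the alert count at the number of hosts a scan could possibly touch."""
    return min(default, usable_hosts(cidr))


if __name__ == "__main__":
    print(usable_hosts("192.0.2.0/29"))          # 6
    print(suggested_scan_count("192.0.2.0/29"))  # 6
    print(suggested_scan_count("192.0.2.0/24"))  # 30
```

The sketch only lowers the count for small networks; raising it for larger address spaces remains a manual tuning decision.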

The rule on an excessive number of SIP 4xx responses attempts to catch the majority of attacks outlined in this article. What it effectively does is match responses which contain a client error. This may be a 404 Not Found response like the one given by an Asterisk box when running svwar to identify SIP extensions or users. It will also match a password cracking attempt on an Axon PBX, or an extension enumeration attack on a Brekeke PBX when using svwar with the OPTIONS method. Of course, it will not catch a scan for SIP devices on a network that does not have many devices, simply because the number of responses would be low.
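To make "match responses which contain a client error" concrete, the following Python sketch reads the first line of each SIP message and counts the 4xx responses. It assumes the first lines have already been extracted from captured traffic; the function names are hypothetical.

```python
import re
from collections import Counter

# A SIP response starts with a Status-Line: "SIP/2.0 <3-digit code> <reason>".
STATUS_LINE = re.compile(r"^SIP/2\.0 (\d{3}) ")


def status_code(first_line: str):
    """Return the status code of a SIP response, or None for a request."""
    match = STATUS_LINE.match(first_line)
    return int(match.group(1)) if match else None


def count_client_errors(first_lines):
    """Count 4xx responses per status code across an iterable of first lines."""
    counts = Counter()
    for line in first_lines:
        code = status_code(line)
        if code is not None and 400 <= code <= 499:
            counts[code] += 1
    return counts
```

Feeding the resulting totals into a per-source time window would give roughly the behavior the rule describes.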

The ghost phone call can also be easily detected since it generates a large number of ringing messages. Of course, the payload of this attack is audible, and therefore the benefits of adding this rule might not be immediately apparent since the attack makes itself so obvious. However, a Snort rule at this stage might be very useful during incident response, when trying to determine things such as the source of the attack. The rule should be modified depending on the network. For example, it does not make sense to deploy this Snort rule in a call center that takes 50 calls every minute.

Snort is not the only tool to monitor your VoIP infrastructure for attacks. In fact, Snort would very likely NOT detect any attacks passing through encrypted traffic. On the other hand, monitoring the logs on your IP PBX might be a good way of detecting some attacks destined for the SIP gateway. J. Oquendo posted a BASH script called astrap which monitors the Asterisk log entries for an excessive number of failed authentication attempts. This small tool lists the offender's IP address, the number of password failures, and the extensions that were targeted on the Asterisk system.
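The same kind of log monitoring can be sketched in Python. This is not the astrap script itself. It assumes Asterisk chan_sip log lines of the form `Registration from '<sip:100@host>' failed for '203.0.113.7' - Wrong password`; the exact wording varies between Asterisk versions, so the pattern and the threshold of 10 are assumptions to adapt.

```python
import re
from collections import defaultdict

# Assumed log format; captures the extension, the source IP and the reason.
FAILED_REGISTRATION = re.compile(
    r"Registration from '.*?<sip:([^@>]+)@[^>]*>' failed for "
    r"'([0-9.]+)(?::\d+)?' - (.+)$"
)


def summarize_failures(log_lines, threshold=10):
    """Return {ip: (failure_count, sorted extensions)} for noisy sources."""
    failures = defaultdict(int)
    targets = defaultdict(set)
    for line in log_lines:
        match = FAILED_REGISTRATION.search(line.rstrip())
        if match:
            extension, ip, _reason = match.groups()
            failures[ip] += 1
            targets[ip].add(extension)
    return {
        ip: (count, sorted(targets[ip]))
        for ip, count in failures.items()
        if count >= threshold
    }
```

Called with the lines of the Asterisk log file, it reports each offending address with its failure count and the extensions it targeted.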

A host intrusion detection system such as OSSEC can be equally useful in detecting and automatically mitigating attacks. At the time of writing, OSSEC does not come preconfigured to support Asterisk log files, but this functionality can be easily added. Listing 21 includes a sample rule file for OSSEC to show how it can be configured to detect username enumeration and password attacks on an Asterisk system such as Trixbox. Listing 22 shows the changes required to enable this new Asterisk rule. We include a decoder entry so that OSSEC will be able to extract the attacker's IP address and then use that to automatically block the attack by adding the appropriate firewall rule.

References

• http://www.ietf.org/rfc/rfc3261.txt – RFC 3261

• http://www.iptel.org/sip/intro/purpose – Purpose of SIP

• http://www.wormulon.net/ – smap

• http://sipvicious.org/ – SIPVicious tool suite

• http://tinyurl.com/rtjl8 – SIP peers external authentication in Asterisk/OpenPBX

• http://www.hackingvoip.com/ – SIPSCAN

• http://www.oxid.it – Cain and Abel

• http://tinyurl.com/yph6jy – Interview with Robert Moore

• http://tinyurl.com/56bwd – VLAN Security White Paper

• http://www.snocer.org/Paper/sip-rules.zip – Snocer, snort rules

• http://www.infiltrated.net/scripts/astrap – astrap

• http://www.ossec.org/ – OSSEC

• http://www.trixbox.org/ – Trixbox

Hat Tip - Sandro Gauci

Wednesday, October 5, 2011

SIP Peering KPI’s - How to Measure Answer Seize Ratio

Service providers have for many decades measured key performance indicators for their SS7 interconnects with long-distance or international operators or peering partners. Such measurements are defined in ITU-T Recommendations E.411 "International network management – Operational guidance" and E.422 "Observations on international outgoing telephone calls for quality of service", and include Answer Seize Ratio [ASR], Post Dial Delay [PDD] and Network Efficiency Ratio [NER].

Name	Description
Counted	Number of calls
ASR	Answered calls (percent)
SSB	Subscriber busy (percent)
CGC	Circuit Group Congestion (percent)
SEC	Switching Equipment Congestion (percent)
CFL	Call Failure (percent)
RSC	Reset Circuit Signal (percent)
UNN	Unallocated Number (percent)
ADI	Address Incomplete (percent)
CLF	Clear Forward (percent)
LOS	Line Out of Service (percent)
rt	Response time (average)
wt	Wait time with answer (average)
wna	Wait time with no answer (average)
ct	Call time (average)
ht	Hold time (average)
Minutes	Total Call time in minutes

Table 1

Answer Seize Ratio [ASR] is used as a measure of network quality, although the measurement also includes user behavior. In other words, if a call is not answered the network is not at fault, yet the uncompleted call still lowers the ASR and so indicates lower quality. However, for a given hour within a day, unanswered calls represent roughly the same percentage from day to day, so this offset is normalized out and carriers are able to monitor the trend of ASR and treat it as a relative measurement.

Network Efficiency Ratio was designed to eliminate user behavior as a factor and better represent pure network performance.

Network Efficiency Ratio [NER] is defined as:

(User Answers or Normal call clearing - Cause code: 16
+ User Busy - Cause code: 17
+ Ring No Answer - Cause code: 18 & 19
+ Terminal Rejects - Cause code: 21)
NER = ------------------------------------------------------- x 100
(Total # of Call Attempts i.e. IAM’s)
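As a worked example of the NER definition above, here is a small Python sketch that computes it from a list of release cause codes. The function name and the sample traffic mix are invented for illustration; cause code 34 (no circuit available) stands in for a network-side failure.

```python
from collections import Counter

# Cause codes counted in the NER numerator, as listed in the definition above.
NER_CAUSE_CODES = {16, 17, 18, 19, 21}


def network_efficiency_ratio(release_causes, total_attempts: int) -> float:
    """NER in percent from one release cause code per call attempt."""
    if total_attempts <= 0:
        raise ValueError("total_attempts must be positive")
    counts = Counter(release_causes)
    effective = sum(counts[code] for code in NER_CAUSE_CODES)
    return 100.0 * effective / total_attempts


if __name__ == "__main__":
    # Hypothetical hour of traffic: 100 call attempts (IAMs).
    causes = [16] * 60 + [17] * 15 + [19] * 10 + [21] * 5 + [34] * 10
    print(network_efficiency_ratio(causes, total_attempts=100))  # 90.0
```

In this sample only the ten calls released with cause 34 count against the network, so the busy and unanswered calls do not reduce the ratio.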

SIP is a more flexible protocol and has wider uses than simple call control. Therefore, use cases differ widely, and systems such as voicemail and call forwarding skew expected behavior for answered calls.

Furthermore, SIP has a wider range of response messages, which can be used to indicate specific types of failure caused by servers, network devices, or the movement, absence or other behavior of users. Accordingly, the IETF has defined KPI's equivalent to Answer Seize Ratio and Network Efficiency Ratio. These are described in the SIP End-to-End Performance Metrics draft, ietf-pmol-sip-perf-metrics-04. For example, the equivalent of ASR is Session Establishment Ratio (SER).

Session Establishment Ratio (SER) is defined as follows, to quote the IETF:

“This metric is used to detect the ability of a terminating UA or downstream proxy to successfully establish sessions per new session INVITE requests. SER is defined as the number of new session INVITE requests resulting in a 200 OK response, to the total number of attempted INVITE requests less INVITE requests resulting in a 3XX response. This metric is similar to Answer Seizure Ratio (ASR)”

The SER is calculated using the following formula:

# of INVITE Requests w/ associated 200 OK
SER = --------------------------------------------------------- x 100
(Total # of INVITE Requests)-(# of INVITE Requests w/ 3XX Response)
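A short Python sketch of the SER formula, with invented counts, shows the arithmetic. The function and parameter names are assumptions for this example.

```python
def session_establishment_ratio(invites: int, ok_200: int, redirected_3xx: int) -> float:
    """SER in percent: 200 OK count over INVITEs that were not redirected."""
    denominator = invites - redirected_3xx
    if denominator <= 0:
        raise ValueError("no INVITE requests left after excluding 3XX responses")
    return 100.0 * ok_200 / denominator


if __name__ == "__main__":
    # Hypothetical hour: 1000 INVITEs, 50 redirected with 3XX, 570 answered.
    print(session_establishment_ratio(invites=1000, ok_200=570, redirected_3xx=50))  # 60.0
```

Without the 3XX adjustment the same traffic would report 57 percent, because the redirected INVITEs would be counted as failed attempts.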

Here is the message flow which defines the basic SER. If the session INVITE request results in a redirection response, such as a 302 Moved Temporarily, that request should be subtracted from the denominator.

                     UA1                 UA2
                      |                   |
                      |INVITE             |
         +----------->|------------------>|
         |            |                180|
         |            |<------------------|
Session Established   |                   |
         |            |                   |
         |            |                200|
         +----------->|<------------------|
                      |                   |

In SS7, the fate of the call is determined by the RELEASE message which ends the call. The fate of a SIP call is determined by the Response to each Request for each transaction within the SIP call. For example, as above, session establishment is determined by a successful outcome of the session establishment phase.

The SIP Peering KPI’s RFC 6076 also defines Session Establishment Effectiveness Ratio (SEER) which is similar to Network Efficiency Ratio [NER] in the SS7 ISUP circuit switched world.

This metric is complementary to SER, but excludes the effects of the terminating UAS or endpoint, i.e. it excludes user behavior from the metric and therefore more closely reflects the performance of the network. SEER is defined as the ratio of the number of INVITE requests resulting in a 200 OK, 480 (Temporarily Unavailable), 486 (Busy Here, i.e. that endpoint is busy), or 600 (Busy Everywhere, i.e. the callee is busy at every endpoint) response to the total number of INVITE attempts less those resulting in a 3XX, 401, 402, or 407 response.

In order to simplify the formula, the following variable ‘a’ is used to summarize multiple SIP responses:

a = 3XX, 401, 402, and 407

The SEER is calculated using the following formula:

# of INVITE Requests w/ associated 200 OK, 480, 486, or 600
SEER = -------------------------------------------------------- x 100
(Total # of INVITE Requests)-(# of INVITE Requests w/ 'a' Response)
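The SEER formula can be sketched in Python starting from a list holding the final response code of each INVITE. The names and the sample mix of responses are invented for this example.

```python
from collections import Counter

SEER_NUMERATOR = {200, 480, 486, 600}  # outcomes not blamed on the network
CHALLENGES = {401, 402, 407}           # part of variable 'a'


def is_excluded(code: int) -> bool:
    """True for the responses summarized as 'a': any 3XX, 401, 402, or 407."""
    return 300 <= code <= 399 or code in CHALLENGES


def seer(final_responses) -> float:
    """SEER in percent from one final response code per INVITE request."""
    counts = Counter(final_responses)
    total = sum(counts.values())
    excluded = sum(n for code, n in counts.items() if is_excluded(code))
    if total - excluded <= 0:
        raise ValueError("no INVITE requests left after excluding 'a' responses")
    numerator = sum(counts[code] for code in SEER_NUMERATOR)
    return 100.0 * numerator / (total - excluded)


if __name__ == "__main__":
    # Hypothetical sample of 120 INVITEs.
    sample = [200] * 60 + [486] * 15 + [480] * 5 + [302] * 10 + [407] * 10 + [503] * 20
    print(seer(sample))  # 80.0
```

In the sample, only the twenty 503 responses lower the ratio; the busy and unavailable callees count as effective sessions.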

SIP response codes

2xx—Successful Responses

e.g. 200 OK

4xx—Client Failure Responses

401 Unauthorized (Used only by registrars or user agents. Proxies should use 407 Proxy Authentication Required)
402 Payment Required (Reserved for future use)
480 Temporarily Unavailable
486 Busy Here

6xx—Global Failure Responses

600 Busy Everywhere
603 Decline

Another very useful measurement defined by the IETF is Session Defects Ratio (SDR), which Palladion can also graph by checking a box. ‘503’ SIP Response messages commonly indicate that a route to one of your peering partners is failing due to congestion of the Gateway or SoftSwitch.

Session Defects Ratio (SDR) is the percentage of call attempts receiving the following responses, in relation to total call attempts or INVITES:

o 500 Server Internal Error

o 503 Service Unavailable

o 504 Server Timeout

The SDR is calculated using the following formula:

# of INVITE Requests w/ associated 500, 503, or 504
SDR = ----------------------------------------------------- x 100
Total # of INVITE Requests
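To show how SDR might be used per peering route, here is a Python sketch that computes the ratio for each route and lists the routes above a limit. The route names and the 5 percent threshold are hypothetical.

```python
DEFECT_CODES = {500, 503, 504}


def session_defects_ratio(final_responses) -> float:
    """SDR in percent from one final response code per INVITE request."""
    responses = list(final_responses)
    if not responses:
        raise ValueError("no INVITE requests to evaluate")
    defects = sum(1 for code in responses if code in DEFECT_CODES)
    return 100.0 * defects / len(responses)


def congested_routes(responses_by_route, threshold_percent: float = 5.0):
    """Return the names of routes whose SDR is above the threshold."""
    return sorted(
        route
        for route, codes in responses_by_route.items()
        if session_defects_ratio(codes) > threshold_percent
    )


if __name__ == "__main__":
    traffic = {
        "peer-a": [200] * 95 + [503] * 5,
        "peer-b": [200] * 80 + [503] * 15 + [500] * 5,
    }
    print(congested_routes(traffic))  # ['peer-b']
```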

In addition to this, some carriers like to plot Post Dial Delay [PDD] and to generate an alert when this measurement exceeds a certain time threshold. PDD is defined as the time interval between transmission of the INVITE and reception of the ‘180 Ringing’ response message. The RFC prefers to define Successful Session Setup [SRD], the SIP equivalent of Post-Selection Delay (defined in E.721). A rising PDD is an early indication that congestion is about to occur on a given route, which makes it a very useful KPI. Palladion can be configured to send an SNMP trap or other notification to the Network Operations Center [NOC] when congestion is imminent so that preemptive action can be taken.
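The PDD definition above reduces to a timestamp difference. The following Python sketch computes it and lists the calls that exceed a limit. The 4-second threshold and the tuple layout (Call-ID, INVITE time, 180 time) are assumptions for this example and are not taken from Palladion.

```python
from datetime import datetime, timedelta


def post_dial_delay(invite_sent: datetime, ringing_received: datetime) -> timedelta:
    """Interval between sending the INVITE and receiving 180 Ringing."""
    if ringing_received < invite_sent:
        raise ValueError("180 Ringing cannot precede the INVITE")
    return ringing_received - invite_sent


def pdd_alerts(calls, threshold: timedelta = timedelta(seconds=4)):
    """Return the Call-IDs whose PDD exceeds the threshold."""
    return [
        call_id
        for call_id, invite_sent, ringing_received in calls
        if post_dial_delay(invite_sent, ringing_received) > threshold
    ]
```

Tracking the average of these values per route over time gives the trend that a NOC would watch for signs of congestion.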

Here is a table listing all the new RFC 6076 KPI’s with their SS7 or circuit switched equivalent:

SIP Peering KPI’s RFC 6076	SIP Definition	ISUP Equivalent	ISUP Definition
Registration Request Delay (RRD)	Time of final response – time of REG attempt	N/A Note *
Ineffective Registration Attempts	# IRA/total REG attempts	N/A
Session Request Delay (SRD)	Time of response – Time of INVITE
Successful Session Setup [SRD]	Time of response – Time of INVITE, e.g. INVITE to 180	Post-Selection Delay (defined in E.721)	IAM to ACM (or ALERTING)
Failed Session Setup [SRD]	INVITE to response indicating failure, e.g. a 4XX (excluding the 401, 402, and 407 non-failure challenge response codes), 5XX, or 6XX message	N/A	IAM to REL with a cause code indicating a failure
Successful Session Disconnect Delay [SDD]	BYE to 2XX acknowledgement	Time to clear a good call or CIC	REL to RLC
Failed Session Disconnect Delay [SDD]	BYE, to Timer F Expires	essage	Missing RLC i.e. expiry of ISUP T1 timer
Session Duration Time [SDT]	Time of BYE or time out – Time of 200 OK	Time of REL - Time of ANS	Average Call Hold Time (ACHT)
Successful Session Duration Time	200 OK response to an INVITE to BYE	See note ** below
Failed session duration SDT	200 OK response to an INVITE, and the resulting Timer F expiration.	“awaiting Answer timer”	SS7 ISUP timer T9 (no response to ACM)
Session Establishment Ratio (SER)	(# of “good call” INVITEs) / (Total INVITEs – 3xx Responses) x 100	Calls which connect Calls which fail	Answer Seize Ratio (ASR)
Session Establishment Effectiveness Ratio (SEER)	(# of INVITE Requests w/ associated 200 OK, 480, 486, or 600) / ((Total # of INVITE Requests)-(# of INVITE Requests w/ 'a' Response)) x 100	NER = (Answers (cc 16) + User Busy (cc 17) + Ring No Answer (cc 18 & 19) + Terminal Rejects (cc 21)) / (Total call attempts) x 100	Network Efficiency Ratio (NER) ITU E.411.
Ineffective Session Attempts (ISA): INVITEs resulting in 408 Request Timeout, 500 Server Internal Error, 503 Service Unavailable or 504 Server Timeout	(# of ISA) / (Total # of Session Requests) x 100		Similar to Ineffective Machine Attempts (IMA) in telephony applications; adopted from Telcordia GR-512-CORE [GR-512].
Session Completion Ratio (SCR) (a Session Completion is any SIP dialog that receives a valid response)	(# of Successfully Completed Sessions) / (Total # of Session Requests) x 100	Call Completion Ratio (CCR)

Notes:

· * REGISTRATIONs are rare across SIP peering points, as these interconnects are typically between two trusted environments. In addition, SBC’s are used to protect the sanctity of each network. Similarly, an interconnect between two SS7 networks is a trusted and secure interface, so no equivalent of REGISTRATIONs exists in SS7, although the MTP3 protocol is used to establish routing between Signaling Point Codes and ISUP control messages are used to set up voice trunks.

· ** Monitoring for short-duration successful calls has not been as important in a circuit switched network as in a VoIP network, because speech quality is not so much of an issue in circuit switched networks. In a VoIP network, short calls are important to monitor because when speech quality is very bad, callers hang up immediately and, hopefully, call back.

References

· Basic Telephony SIP End-to-End Performance Metrics Request for Comments: 6076

· ITU-T Recommendation E.411 : International network management - Operational guidance

· ITU-T Recommendation E.422 : Observations on international outgoing telephone calls for quality of service

· Telcordia GR-512-CORE - LSSGR: Reliability, Section 12