Thursday, December 15, 2011

The Essence of VoIP Issues

In the telephony world VoIP is a hybrid.

A VoIP solution usually has dedicated Internet connections, dedicated T1 voice circuits, and a digital-to-analog conversion at two or more points in a call.

On top of these common telephony components, you add the complexities associated with IP packets: latency, real-time transport of packets, call control channel translations, and dial plan nuances.

One of the first questions to ask yourself when troubleshooting a VoIP call is whether or not a problem exhibits itself on all of your calls or just select ones.

In future posts we will begin to isolate trouble to the various components of the call circuit and learn what clues to look for with each isolated issue or problem.

Friday, November 18, 2011

The True Customer Experience of a Voice Service

Customers perceive the quality of a voice service and the reliability of a voice service based on many different experiences. Such experiences may be good or bad, but very often customers notice only the bad experience – obviously as they buy a service that is meant to work 100% at any time and from anywhere.

Bad experiences are often caused by network related problems:

• Calls with degraded voice quality
• Interrupted or Dropped calls or VoIP Dropped calls
• Unsuccessful call attempts
• Missed calls that did not ring or calls that were not signaled

Also end device related problems bring negative user experience:

• Empty battery on smartphone
• End device & VoIP Endpoint crashes
• Inability to set up a 3-party conference call

For all voice operators, over-the-top (OTT) and legacy networks, these end-device related troubles are very important, because in the end the customer does not care at all why something is not working. He will blame it on his voice service provider anyway.

As a result, when considering the management of customer experience in next generation voice networks, it is very important to look beyond your own network infrastructure. Internet connections, utilization and network problems on the customers’ corporate LAN and end devices cause a large portion of the problems. So it would definitely not be a good idea to consider a service to be running well simply because all systems in one’s own network are up and running.

Instead operators also need to manage:

• End device & VoIP Endpoint firmware versions
• App/softphone software versions
• IP link quality characteristics
• Performance and problems at your Customers’ premises
• Error logs from the end device


Very often problems can be solved by optimizing configurations, codec settings, centralized firmware updates or proactively contacting the customer about upcoming problems with his device or software. So a tool is needed that sees the endpoint or VoIP device as well as the network. But for all of that, a customer experience management system must provide the full set of information about how the customer truly perceives the service end-to-end.

When looking at customer experience management for VoIP networks, operators should choose a holistic approach and carefully select software solutions that are designed to look far beyond the usual scope and take care of the most important stakeholder in this game – the customer.

Monday, November 14, 2011

VoIP Service Assurance Available in the Cloud

Due to the huge amounts of traffic, telephone network analyzers are complex systems with beefy hardware requirements, complex setup procedures and the need for constant monitoring. For bigger networks the cost of running them is easily worth it, but for smaller networks deploying them might not be feasible, be it for lack of time to learn how to run them or because the initial investment is too big.

A recent trend in IT is the introduction of cloud services, including for example cloud-based service assurance or cloud-based VoIP troubleshooting. In this setup the main part of the analyzer system is run in a central system, where it is managed by a knowledgeable and dedicated staff experienced in network monitoring and efficient VoIP troubleshooting. This leads to rapid deployment times and minimal investment, to the point where one can just 'try before you buy'. In other words, troubleshooting on demand. The economy of scale brought by cloud-based network management or cloud-based network monitoring makes the most advanced analyzers available to even the smallest VoIP providers, leveling the playing field for all. You can even install it at the last minute in just a few seconds and use it when you have a problem.

By deploying a VoIP traffic analysis in the cloud, you can usually achieve up-times far better than by using a complicated local setup and you can make sure that your running costs stay predictable. With today's technologies, security and data safety are a well-managed issue, all traffic streams being encrypted. A centralized software setup or hosted service assurance makes updates easier and your service provider is also able to diagnose and fix problems more efficiently by having complete access to the stack.

Cloud-based service assurance, network monitoring or VoIP troubleshooting can now be made available on a pay-per-use basis. Basically, you pay when you log in to troubleshoot a problem on your network. If you do not have a problem and never log in, you do not pay a cent. If you do log in, for a small one-use charge you fix the urgent problem that is troubling your customer. The return on investment for use of the network troubleshooting service is instant. The advantage of having paid the per-use fee is obvious to your management, because you have just fixed a difficult problem impacting your voice service, keeping your customer and allowing you to move on to creating a new service or engineering your network.

Cloud-based network monitoring gives you universal visibility and centralized access to the analysis. Installation and setup are so rapid that it can be used on demand when needed, i.e. VoIP troubleshooting on demand. In summation, cloud computing brings lots of advantages to VoIP service providers (ITSPs) who could not previously afford or justify the expense of traffic analysis. With generally better availability, a transparent pricing structure (based on usage) and better support options, cloud solutions are surely an understandable trend in today's VoIP software stack market.

Thursday, November 10, 2011

Network Visibility Matters

“Knowledge is power” as the phrase goes and, no matter what you want to achieve in your network management, a truly in-depth knowledge of your network processes is THE key asset when it comes to offering excellent customer service and troubleshooting, or to managing the overall network health with ease. Network knowledge and visibility allow you to find problems in minutes rather than hours, reducing operations costs and impressing your customer with rapid resolution of problems.

Today’s fast-evolving telecommunication market, with its multiple video and audio streams, a fast-growing user base and geographically dispersed deployments, is challenging, and keeping an eye on the entire network is not an easy task. Often, telecommunication operators resort to several different systems or are not able to trace back single network incidents, and so do not achieve satisfying network visibility in the end.


First, choosing a solution that is able to process all signaling messages traversing the network in real-time and that instantly correlates multiple sites and network protocols is vital. Also, you will want to know the detailed status of the entire network health, the different application layers of the network, the end user devices and all the segments that a media stream is traversing through. A fully correlated end-to-end overview of all parts in the network can easily be provided when distributed agents are installed locally enabling a quick root-cause detection of network issues.

Talking about visibility, it is not only helpful to have an overall view of what is happening in the network; drilling down into single calls, users, end devices and network devices provides very valuable information as well. It should be easy to change the perspective from a high-level overview, e.g. for the operations department, down to a granular and in-depth view of an individual user and their history, which is helpful information for the customer service or support department. A filtering option is furthermore of high value for tracking users or searching for specific devices, numbers or customer groups, which are known as realms.

A powerful network management tool should be able to provide historical data, process data in real time and also allow current and future proactive network management. Finally, a future-proof solution is a system which is easily integrated into an existing infrastructure, thereby reducing complexity, and instead of requiring more hardware, it should be a lightweight solution that grows smoothly with the network.

A real-time intelligence software suite increases the ROI of LTE, IMS and VoIP deployments. It provides fixed and mobile operators a full end-to-end visibility and supports a multi-site and protocol call correlation.

Wednesday, November 2, 2011

One Way Audio and the Inability to Receive Incoming VoIP Calls


As a VoIP Service Provider, you will no doubt have heard customers complaining about these two problems. In one case, the customer makes a call or receives an incoming call and can hear the other party. However, the other party complains they cannot hear your customer. This is possibly because the VoIP endpoint phone has a problem or there could be a routing problem in the network.



More commonly, this is a problem with the firewall belonging to the enterprise. The enterprise network Administrator prefers not to allow incoming data sessions and for the VoIP session, the incoming RTP uses a different port to the outgoing RTP. The UDP port number used by RTP is negotiated by the SDP inside the SIP signaling at the commencement of the session or phone call. The actual details are subject to the implementation and configuration of the network and sometimes outgoing and incoming UDP ports for the RTP may be dissimilar.



Re-configuring the firewall and ensuring that the RTP port is the same for incoming and outgoing RTP streams will rectify this problem.
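To check which ports a firewall actually has to pass, it can help to read the RTP port each side advertises in its SDP. The following is a minimal sketch, assuming the SDP body has already been extracted from the SIP message as text; the function name and the sample SDP are hypothetical. The port on the `m=audio` line is the UDP port on which that side wants to receive RTP.

```python
from typing import Optional


def rtp_audio_port(sdp: str) -> Optional[int]:
    """Return the UDP port from the first audio media line of an SDP body."""
    for line in sdp.splitlines():
        # Media lines look like: m=audio <port> RTP/AVP <payload types...>
        if line.startswith("m=audio "):
            port_field = line.split()[1]
            # The port may carry a "/<number of ports>" suffix, e.g. 49170/2
            return int(port_field.split("/")[0])
    return None


# Hypothetical SDP offer and answer, for illustration only
offer = (
    "v=0\r\no=alice 1 1 IN IP4 10.0.0.5\r\ns=-\r\nc=IN IP4 10.0.0.5\r\n"
    "t=0 0\r\nm=audio 49170 RTP/AVP 0 8\r\na=rtpmap:0 PCMU/8000\r\n"
)
answer = (
    "v=0\r\no=bob 1 1 IN IP4 192.0.2.10\r\ns=-\r\nc=IN IP4 192.0.2.10\r\n"
    "t=0 0\r\nm=audio 30002 RTP/AVP 0\r\n"
)

offer_port = rtp_audio_port(offer)
answer_port = rtp_audio_port(answer)
print("caller receives RTP on UDP port", offer_port)
print("callee receives RTP on UDP port", answer_port)
```

If the two ports differ, the firewall rules on each side need to allow inbound UDP on the port that side advertised.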

A similar problem, but more difficult to isolate, is where a customer can make outgoing calls, but colleagues say they called your customer and the VoIP phone did not ring. Failure of the incoming call is caused by REGISTRATION of their VoIP phone having expired on its registration server.


When a VoIP endpoint successfully registers on the registration server or the network, the registration server includes in its response, a time frame during which the REGISTRATION is valid. The VoIP endpoint/phone should RE-REGISTER with the registration server before half of the time period has expired.

Frequently, there is an imbalance between when the endpoint believes it should renew the registration and when the registration server expires the registration. A tool which can track the state of registration [New, Unauthorized, Expired or Gone] is invaluable in being able to find these stealth problems.
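The timing rule above can be sketched in a few lines. This is an illustration only: the state names and the function are hypothetical and are not the states reported by any particular tool. It assumes you know when the registration was accepted and the validity period the server granted.

```python
def registration_state(registered_at: float, expires_s: int, now: float) -> str:
    """Classify a registration from the registrar's point of view.

    registered_at: time (in seconds) at which the registration was accepted
    expires_s:     validity period granted by the registration server
    now:           current time, on the same clock as registered_at
    """
    age = now - registered_at
    if age >= expires_s:
        # The server has dropped the binding: incoming calls will not ring
        return "expired"
    if age >= expires_s / 2:
        # The endpoint should already have re-registered by this point
        return "refresh due"
    return "valid"


# A registration granted for one hour, checked at three points in time
for t in (1000, 1800, 3600):
    print(t, registration_state(registered_at=0, expires_s=3600, now=t))
```

An endpoint that is still in "refresh due" close to the expiry time is a candidate for the mismatch described above.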



Wireshark will show you the time and relationship between such REGISTRATION messages. However, an advanced intelligent Service Assurance tool such as Palladion will track the state of each endpoint, and also the firmware release of each endpoint, allowing you to find these anomalies in seconds.

Saturday, October 29, 2011

MOS [Mean Opinion Score] for High Definition or Wideband Telephony

Mean Opinion Score [MOS] is a scale from 1 to 5 indicating speech quality - 1 is bad and 5 is excellent. MOS test sessions comprise 15 to 25 people listening to speech files of good quality and of poor quality with impairments and scoring them subjectively. This subjective test process is specified in ITU-T P.800. In over 16 years where these tests have taken place, no statistically significant number of participants ever scored any speech recording as being excellent or 5.0. The highest score typically obtained in any test was 4.5.
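As a minimal sketch of how a panel's ratings become a MOS, assuming the five-point listening-quality scale of ITU-T P.800 (1 bad, 2 poor, 3 fair, 4 good, 5 excellent). The ratings below are made-up numbers for illustration, not test data.

```python
def mean_opinion_score(ratings):
    """Average a panel's 1-5 ratings for one speech sample."""
    if not ratings:
        raise ValueError("at least one rating is required")
    for rating in ratings:
        if rating not in (1, 2, 3, 4, 5):
            raise ValueError(f"rating out of range: {rating!r}")
    return sum(ratings) / len(ratings)


# Hypothetical ratings from five listeners for the same recording
panel = [4, 5, 4, 3, 4]
print(f"MOS = {mean_opinion_score(panel):.2f}")
```

Because the MOS is an average over many listeners, a single listener rating a sample 5 does not by itself produce a MOS of 5.0.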

High Definition or Wideband Telephony speech uses the new POLQA speech quality metric for objective prediction of MOS. The old PESQ algorithm has been used for narrowband telephony since it was approved in 2000. It is desirable to use the same scale so that laboratories can compare new results for wideband telephony with their old PESQ database. However, the question of human expectation comes into play because all these objective measurements performed by computers must correlate with or predict subjective experience. If you watch a video on your smartphone, you might consider the picture quality as being good. Your expectations are put in the context of the small screen and the convenience of the video being played on a handheld smartphone. If you were shown the same video on your brand-new expensive high-definition 1080p TV, you would be very disappointed even if the pixel resolution had been scaled to the 62-inch screen size. Your expectation of quality is tempered by the format in which you are viewing it.

The same holds for speech and audio. If you were to participate in a MOS test and were invited into a studio with high fidelity speakers and orchestral classical music playing, and asked to rate the quality of the High Definition speech you are about to hear, your expectations would be set high and you'd be more critical. You would score the audio lower than if you had been asked to rate the speech quality of your most recent cellular phone call.



POLQA offers two scales, the narrowband scale and the super wideband scale. Super wideband telephony reaches 14 kHz analog audio frequency. The narrowband focus scale maps directly onto the old PESQ scale and exploits the higher scores not given by test participants in narrowband tests.



• NB: Maximum MOS value 4.25

• WB: Maximum MOS value 4.5

• SWB: Maximum MOS value 4.75



So a score of 4.5 on the narrowband POLQA scale is experimentally the best value you will ever obtain with wideband telephony equipment. You could conceivably measure a MOS value of 4.75 if you were measuring super wideband equipment.


In future years, the industry will migrate exclusively to the super wideband POLQA scale, as soon as users always expect high-definition or hi-fi quality from communications audio.

The picture shows the iLBC codec measuring 4.21 on the narrowband focus scale.


 

For more information on making PESQ and POLQA measurements, ensure you contact only renowned and well-respected test vendors, because the science of speech quality measurement requires expertise and experience in many different areas: audio, analog electronics as well as computing. It is easy to make a measurement, but care is required to ensure that the measurement is accurate and correlates to human subjective experience.


The most trusted vendor for speech quality metrics is Malden Electronics, available in USA through Teraquant Corporation – www.teraquant.com



Tuesday, October 25, 2011

Can You Hear Me Now?

POLQA is the new ITU-T Standard for Speech Quality Measurement which embraces Wideband or High Definition Telephony.

“Can you hear me now?”

We’ve all heard the refrain. How often have you been on a mobile phone and not been able to hear the other party? How often have you experienced drop-outs on a VoIP call and missed that vital clue, that important piece of information the caller mentioned which would have allowed you to understand their needs? Maybe you lost the business as a result. Good clear speech quality means productivity, both in business and in personal life. Everyone is critically busy these days and if you have to ask folks to repeat themselves, you waste time, frustrate meaningful conversation and miss information.

The existing telephony network uses 300-3400 Hz analog bandwidth, digitized at a sampling rate of 8 kHz. 8 bits of vertical resolution multiplied by 8,000 samples per second gives the traditional 64 kbps bandwidth required for a voice channel. Compression by codecs such as G.729 and iLBC for VoIP, specifically iSAC for Skype, and GSM-FR and EVRC for wireless transmits narrowband traditional telephony at data rates as low as 4 kbps.
So now we can compress voice to very low bandwidths and at the same time we have broadband Internet. So what can we do to improve speech quality?
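The channel arithmetic above is easy to check. A minimal sketch, using the standard 8,000 samples per second and 8 bits per sample of a narrowband voice channel, and the 4 kbps figure quoted above as the low end for compressed speech; the function name is hypothetical.

```python
def pcm_bitrate_bps(sample_rate_hz: int, bits_per_sample: int) -> int:
    """Bit rate of an uncompressed single-channel PCM stream."""
    return sample_rate_hz * bits_per_sample


narrowband_bps = pcm_bitrate_bps(sample_rate_hz=8000, bits_per_sample=8)
compressed_bps = 4000  # low end of the codec rates quoted in the text

print("uncompressed voice channel:", narrowband_bps, "bit/s")
print("compression ratio at 4 kbps:", narrowband_bps / compressed_bps)
```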

Wideband or High Definition Telephony technology is now appearing in VoIP networks and wireless networks using voice codecs such as G.722 and AMR-WB. This provides speech with an analog bandwidth up to 7 kHz and gives a richer listening experience. Those problems you currently have trying to recognize which of your young nieces or nephews is speaking to you are due to high frequencies being filtered out by narrowband telephony. Wideband telephony will reinstate these, enriching your telephone conversation experience and improving productivity through speech clarity. This technology will eventually extend telephony speech all the way up to 20 kHz, the limit of human hearing, equivalent to hi-fi music systems.

3GPP Release 5 introduces the AMR-WB codec, which gives enhanced speech quality using data rates of only 16 kbps. So wideband telephony or high definition telephony is being made available to wireless cellular networks.

Tools to Automatically Measure Speech Quality

Determining the subjective speech quality of a transmission system has always been an expensive and laborious process. The tool described in ITU-T Rec. P.862, Perceptual Evaluation of Speech Quality (PESQ), provides a rapid and repeatable result in a few moments. PESQ is an objective measurement tool, i.e. a computer measures the quality of the received audio in relation to the audio that was transmitted. PESQ predicts, or correlates very closely with, the results of subjective listening tests [i.e. human beings listening to speech files] on telephony systems. The resulting quality score is analogous to the subjective “Mean Opinion Score” (MOS) measured using panel tests according to ITU-T P.800. Strictly speaking, MOS is a score derived from human subjective testing. The PESQ scores are calibrated using a large database of subjective tests.

The ITU-T selection process that resulted in the standardization of PESQ involved a wide range of conditions, with demanding correlation requirements set to ensure that it has good performance in assessing conventional fixed and mobile networks and packet-based transmission systems.

Since ITU-T Rec. P.862 was originally released in 2000, further mappings of the PESQ score have been created. PESQ-LQ modified the score to improve correlation with subjective test results at the high and low ends of the scale, where the raw PESQ score was found to be less accurate. A new mapping described in ITU-T Rec. P.862.1 has been released that further modifies the raw score and correlates better with subjective testing.
PESQ Shortcomings - Time Warping

PESQ takes into account coding distortions, errors, packet loss, delay and variable delay, and filtering in analogue network components. The user interfaces have been designed to provide a simple access to this powerful algorithm, either directly from the analogue connection or from speech files recorded elsewhere.

PESQ Shortcomings
Noise Reduction:  (Subjective  >  PESQ)
The performance of a network or a network element can be fully characterized using high quality analog test equipment and PESQ. High quality analog interfaces are needed because the test equipment itself very easily introduces impairments, which are included in the measurement and drag the PESQ score lower than it should be for the system under test or network element. Whilst it is possible to use phonetically balanced sentences and other test patterns, accurate and repeatable measurements of the active speech level, activity, delay, echo, noise and speech quality can be obtained quickly using an artificial speech test stimulus in different languages, which comprehensively tests all the voice sounds the codec may encounter while keeping the test time short. A graphical mapping of the errors provides a useful insight into how the signal has been degraded and exactly what kinds of sounds cause problems for the codec or system under test.

Since the launch of PESQ in 2000, there have been many advances in codec design. Unfortunately, PESQ was not trained on these later designs and can produce scores that are lower than expected from subjective tests. Time-warping and voice quality enhancement techniques are particularly difficult for PESQ. The ITU agreed on a new standard, P.863 POLQA, in 2010. POLQA addresses many of the issues and produces reliable scores for codecs, both old and new. POLQA is now available on a couple of speech quality measurement platforms, but Malden is the only platform that provides a quiet, high-quality analog front-end and the only platform to be recommended.

Friday, October 21, 2011

Network Monitoring - It's Your Livelihood


What are your daily duties as the VoIP network operations director? You have to keep your network up and running. You have to answer calls which may relate to situations like “X location is down” or “Y location is slow”. May we suggest that you proactively monitor your network as described below and perform tasks like:


  1. Monitor your network and take actions with respect to situations like device and line failures.
  2. Analyze line/physical facility utilization, errors on the facility and be sure about network performance and conformance to SLAs.
  3. Be aware of what "talks to what" and when?  Be sure how much bandwidth is needed for every single application riding your network (and the networks you traverse.)
  4. Know your exact data flows over your networks.
If you have all this information at your disposal, people will think twice before they point a finger at you.

But how can you achieve this?

You need a phased approach to understand network monitoring. I am not talking about network layers, but network monitoring layers. We have to look closely at the monitoring layers before deciding on network monitoring software needs. A simple summary could include these:

  • Preconditions of network monitoring.
  • Up/Down monitoring
  • Performance Monitoring / SNMP monitoring
  • Who talks with whom? / Netflow monitoring
  • Data capture / Data sniffing
Preconditions of Network Monitoring
Network documentation is essential to monitor a network. Trying to set up network monitoring tools before going through the documentation is a complete waste of time. You will see everything green on the screen even though one of the redundant lines may be down. You will sit staring without knowing what is happening. Always remember, documentation comes first and everything follows.
Suggested Network documentation tools: Powerpoint/Visio, NetViz


Up/Down monitoring
Design a map in which you can see some red and green lights glowing. Green means up and red means down. It is simple yet powerful. You will immediately know that there is a problem if a red light glows. This is based on ping. Almost every IP device supports echo/echo reply. So, you can monitor all IP devices in your network by using ping. Go one step further by monitoring individual applications on a device instead of the whole device. All network applications utilize TCP/UDP ports. You can monitor the applications by trying to access their TCP/UDP ports, for example with telnet. The port being open suggests that the application is running.
Suggested monitoring tools: WhatsupGold, nmap
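The application-level check described above can be sketched with a plain TCP connect, which is what a telnet to the port does. This is a minimal illustration, not a replacement for the tools listed; the function name, host and port are hypothetical. It covers TCP only, because UDP has no connection handshake to test against.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts
        return False


if __name__ == "__main__":
    # Hypothetical target: a SIP server listening on TCP port 5060
    status = "green" if tcp_port_open("192.0.2.10", 5060) else "red"
    print("sip-server:", status)
```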


Performance monitoring / SNMP monitoring
The lines are up, the devices are up, but life is not perfect. People may complain about the performance of data lines, but are they saturated or do they have plenty of spare bandwidth?  Is there packet loss on the lines? Are routers running out of memory? We need SNMP to monitor the heart beat of the network.
Suggested monitoring tools: MRTG, Solarwinds Orion, PRTG
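To answer the "saturated or plenty of spare bandwidth" question, SNMP-based tools typically poll an interface octet counter twice and turn the difference into a utilization figure. The sketch below shows only that arithmetic, assuming the two counter readings have already been fetched; the function name and sample values are hypothetical.

```python
def link_utilization(octets_start: int, octets_end: int, interval_s: float,
                     link_speed_bps: float, counter_bits: int = 32) -> float:
    """Fraction of link capacity used between two octet-counter readings."""
    # The modulo handles a counter that wrapped once between the two polls
    delta_octets = (octets_end - octets_start) % (2 ** counter_bits)
    return (delta_octets * 8) / (interval_s * link_speed_bps)


# Hypothetical readings taken five minutes apart on a 1.544 Mbit/s T1
utilization = link_utilization(
    octets_start=1_000_000,
    octets_end=29_950_000,
    interval_s=300,
    link_speed_bps=1_544_000,
)
print(f"T1 utilization: {utilization:.0%}")
```

On fast links a 32-bit counter can wrap more than once between polls, so shorter polling intervals or 64-bit counters are the safer choice there.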


What is "talking" with what? / Netflow monitoring
You may realize that the line is full, but is someone or some application increasing the traffic load enormously? Who are they? Is it necessary traffic? On some devices, the “ip accounting” command gives you an idea of current traffic sources and destinations. Nevertheless, to analyze and optimize the traffic we need flow monitoring. We need to know source and destination IP addresses, TCP/UDP ports and the number of packets/bytes.


Everyone blames the network speed until you publish a network usage report that clearly shows only 15% of the traffic is ERP traffic and the rest comes from Internet access. You should know that flow monitoring tools require more server resources, since they collect enormous amounts of data.
Suggested monitoring tools: Fluke Netflow monitor, Paessler
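A usage report like the one described above boils down to summing bytes per application. The following is a minimal sketch, assuming flow records have already been exported and reduced to (source IP, destination IP, destination port, bytes) tuples; the record layout, function name and sample flows are hypothetical.

```python
from collections import Counter


def traffic_share_by_port(flows):
    """Return each destination port's share of the total bytes seen."""
    totals = Counter()
    for _src_ip, _dst_ip, dst_port, byte_count in flows:
        totals[dst_port] += byte_count
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {port: count / grand_total for port, count in totals.items()}


# Hypothetical flow records: database traffic on 1521, web traffic on 80
flows = [
    ("10.0.0.5", "10.0.1.20", 1521, 150),
    ("10.0.0.6", "198.51.100.7", 80, 500),
    ("10.0.0.7", "203.0.113.9", 80, 350),
]
shares = traffic_share_by_port(flows)
for port, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"port {port}: {share:.0%}")
```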


Data capture / RMON – Sniffer tools
Sometimes you need to observe the exact data flow on the line and not just information about it. Consider this sample scenario. After you find out that a web service causes inappropriately high network traffic, the owner of the application may say “No, we are not pushing this much data to the network. We just respond Yes or No in this web service and it is just 100 bytes”. Therefore, you should sniff the data flow on the line. Maybe you will find that the web service responds with yes or no (100 bytes) plus the definition of the web service (6 kilobytes).
Suggested monitoring tools: Wireshark, Palladion


You can have a look at Network Monitoring Tools in Stanford University web site for a superb list of network monitoring tools. You can find another tidy list at Network Traffic Monitoring in Alan Kennington’s topology.org.

Monday, October 17, 2011

IP PBX Toll Fraud - Everyone Is At Risk

It seems no one is safe. Every IP Telephony Service Provider [ITSP] I have talked to has suffered an IP PBX hack. End users of VoIP phones most commonly do not remember or know how to change the password on their PBX or voicemail account. Default passwords for a PBX are usually the last four digits of the phone extension, so hackers can easily cycle through to find a weak or discoverable password. Once into the PBX, they can originate calls from anywhere on the Internet and pump traffic volume to numbers that will earn them fraudulent revenues.

Voicemail can be configured to dial out. For example, when you hear the greeting “please wait while we attempt to reach your party”, the voicemail system is making an outbound call, which the hacker can set up to redirect to their intended destination.

The fraudster can resell such phone capacity. One of the nice advantages of VoIP is its built-in features for Moves, Adds and Changes. When the time comes to move office, you can just pack your VoIP phone and take it with you. Plug it into the Internet or your company's IP cloud and it will register with your IP PBX and you can make calls. If you travel, you can take your VoIP phone with you, plug it into the Internet in your hotel room and make calls as if from your desk. So the PBX has no concept of your physical location or who is using the phone, and fraudsters, once they hack your PBX, can make calls from anywhere using your account.

The Telecommunications industry has annual revenues of $2.1 Trillion. Telecom fraud is calculated to cost the telecom industry $40 Billion each year.

The calls come out of your IP PBX to your ITSP, which offers a SIP Trunking service (i.e. a service that routes calls from a VoIP environment to the PSTN) and therefore to any expensive international destination offered. So your ITSP receives an invoice from the international carrier for all these international calls. Fraudsters often choose weekends or other times outside business hours to attack. This way, the attack goes unnoticed and the account is beaten to death whilst no one notices or cuts it off.

“How to lose your year’s profits in 15 minutes!” was the way one of our customers described Toll Fraud. The ITSP receives this rather large bill from the international carrier. If he demands payment from the enterprise which has allowed their IP PBX to be hacked, he will surely lose a customer. Small companies who typically have small telecom bills cannot suddenly afford to pay a $20,000 to $30,000 bill. Many events run into the hundreds of thousands.

Although the ITSP caught in the middle may be able to share some of the costs with their customer and the international or long distance carrier, they cannot afford to lose customers and will not often impose the charges on their client. So this is a loss the ITSP usually has to take on the chin.

How can we detect it in real-time and turn it off in real-time to prevent the cost leakage? Advanced monitoring systems are now available which will not only detect the fraud attack in real-time but will also turn it off.
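As a rough illustration of the detection idea, and not a description of how any commercial monitoring system works, the sketch below flags accounts that accumulate a lot of international minutes outside business hours. The record layout, the home prefix, the business hours and the threshold are all assumptions chosen for the example.

```python
from collections import defaultdict
from datetime import datetime


def is_off_hours(start: datetime) -> bool:
    """Weekends, or weekdays outside an assumed 08:00-18:00 business day."""
    return start.weekday() >= 5 or not (8 <= start.hour < 18)


def suspicious_accounts(calls, home_prefix="+1", max_minutes=60):
    """Return accounts whose off-hours international minutes exceed the limit.

    calls: iterable of (account, start datetime, dialed number, minutes)
    """
    totals = defaultdict(float)
    for account, start, destination, minutes in calls:
        international = not destination.startswith(home_prefix)
        if international and is_off_hours(start):
            totals[account] += minutes
    return sorted(acct for acct, total in totals.items() if total > max_minutes)


# Hypothetical call records; 15 October 2011 was a Saturday
calls = [
    ("acct1", datetime(2011, 10, 15, 3, 0), "+5355550100", 45),
    ("acct1", datetime(2011, 10, 15, 3, 50), "+5355550101", 40),
    ("acct2", datetime(2011, 10, 17, 10, 0), "+442055550100", 30),
    ("acct3", datetime(2011, 10, 15, 9, 0), "+12125550100", 500),
]
print(suspicious_accounts(calls))
```

A real system would evaluate this continuously on live call records and block the account as soon as the limit is crossed, rather than reporting after the fact.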

Thursday, October 13, 2011

11 Million Euro Loss - 1 Million in Profits - VoIP Fraud



Recent Voice over IP fraud attackers made over 1 million Euro in profits. 
It seems that they simply were scanning for PBX servers with phone extensions that have weak passwords. Then they abused these accounts to make phone calls for "free", except that free has the price of 11 million EUR for the service provider victims!

Apparently, they originally used these accounts for their own personal phone calls. However, they got greedy and between October 2009 and February 2010 they made 23,500 calls / 315,000 minutes to premium numbers. Then (from what I understood), they got even more greedy and used a Shadow Communication Company and prices for premium numbers that then linked to another site, further obfuscating their "business." Using this scheme they recruited other people to make 1,541,187 fraudulent calls or 11,094,167 minutes of talk time.

They used other premium numbers affiliate networks at first and then, when they realized the potential, they set up a company in the UK - Shadow Communications Inc. - through which they were able to sign a  contract on their own with a premium rate number provider and offer their own affiliates service, basically taking their "business" to a whole new level.

One of the original articles on this can be found here.



Sunday, October 9, 2011

Storming SIP - VoIP Systems Are The New Target

You Will Be Attacked - Reduce Exposure


When implementing a VoIP infrastructure or any kind of net­work technology, it is best to reduce the exposure to attack. The fact that the VoIP infrastructure is typically sitting next to other network entities makes the SIP network elements reachable and possibly vulnerable to an attack coming from the other network serv­ers. The number of VoIP phones and PBXs on the Internet is constantly growing, and if the infrastructure does not require exposure to the Internet, then avoid it. To help you separate the VoIP network from the rest, various network switch vendors allow you to set up a VLAN specifically for VoIP. However, be aware that VLANs are not a panacea, and tools like VoIPhop­per make it easy to demonstrate the fact that VLAN is not enough. Cisco published a white paper called VLAN Security, where they describe how to protect against a number of attacks aimed at VLAN technology. Segregat­ing the VoIP network can also be done through the use of firewalls or physical separation. VPN tunneling has also been previously suggested because it provides both encryption and can also be used to separate the VoIP traffic from the normal traffic.

However, these solutions might not always be feasible, especially since one major advantage of VoIP is that it integrates with other network elements on the Internet. In fact, various VoIP vendors market the fact that you can use your existing network infrastructure without having to lay new cables. Whether or not this is a good idea depends on a large number of factors. When designing a VoIP infrastructure, it is therefore important to understand the requirements and mitigate depending on the case. For example, a hotel VoIP network will have different requirements than a corporate IP phone network, and therefore a systems designer can apply different security precautions during the planning stage. Some other suggestions and observations:
• It is of course good to make use of encryption mechanisms such as TLS and SRTP. Unfortunately, the encryption for SIP and RTP is not yet widely supported. Zfone by the creator of PGP is particularly interesting. We shall not be going through this subject in depth since it is not within the scope of the attacks described within this article, but it definitely deserves a mention.
• The importance of good passwords for IP phones should not be underestimated. If the system does not require that end users set their own password, then do not allow this functionality. Instead, make use of some kind of password management and set their password to one that is unique and hard to guess. Applications such as KeePass, which is open-source and free, can generate strong random passwords for you, as well as manage such passwords in a relatively secure manner.
• OpenSER, which is an open-source SIP server, has a module named pike. This module is able to block requests that exceed a given limit. This can allow for blocking of both extension guessing and password cracking. However, one has to be cautious with such solutions. Attackers can make use of IP spoofing to intentionally block legitimate traffic. The module might also unintentionally block legitimate traffic if it is not properly configured.
• SIP allows extension lines which do not require authentication. If there is no justification for unauthenticated extensions, then make sure NOT to use this feature.
• Hardphones will get security fixes in the form of a firmware update, while softphones will get a new software release. Keeping up to date with the latest versions can be a pain, but it is certainly one way of making sure your system does not fall victim to attackers exploiting a security vulnerability in your SIP phone.
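The request limiting described in the list above can be illustrated with a minimal Python sketch. It is not the pike module's actual algorithm; the class name, the sliding-window approach and the sample limits are assumptions chosen only to show the idea of blocking a source that exceeds a given request rate.

```python
from collections import defaultdict, deque


class RateLimiter:
    """Allow at most `limit` requests per source within `window` seconds.

    Hypothetical helper for illustration; not how OpenSER's pike works internally.
    """

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.seen = defaultdict(deque)  # source IP -> timestamps of recent requests

    def allow(self, src_ip, now):
        recent = self.seen[src_ip]
        # Forget requests that have fallen out of the time window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        recent.append(now)
        return len(recent) <= self.limit


limiter = RateLimiter(limit=3, window=1.0)
results = [limiter.allow("192.0.2.10", t) for t in (0.0, 0.1, 0.2, 0.3)]
print(results)  # the fourth request inside one second is rejected
```

Note that a limiter keyed on source address inherits the weakness mentioned above: a spoofed source address can be used to get a legitimate peer blocked.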

You Are Under Attack
Detection is a very important step in a security solution. A network IDS such as Snort, when placed at the right location, can be of great help when trying to detect that an attack is underway. Snocer, which describes itself as providing Low Cost Tools for Secure and Highly Available VoIP Communication Services, has previously published some Snort rules for public consumption. These rules are also available in the latest Snort community rules. In this section we will describe some of them and explain how they can be effective in catching the attacks mentioned previously. We will also provide some new Snort rules which detect activity described in this article that is not caught by the current Snort community rules.
The Snort rules by Snocer are quite easy to understand, and are able to provide generic detection. Each of the rules looks out for an excessive number of SIP messages coming from a single IP address over a short period of time. The SIP messages watched are INVITE and REGISTER requests, and 401 Unauthorized messages.

The INVITE and REGISTER flood rules catch svwar and svcrack being run with default options against a SIP proxy. To be able to catch a default svmap scan, we need to look out for SIP messages with an OPTIONS request, spread over different hosts in a short time. Listing 20 shows one such rule that triggers an alert if the rule is infringed 30 times in 3 seconds. One should probably adjust this rule depending on the address space being watched by Snort. If Snort is watching a /29 mask, i.e. only 6 hosts, then one should change the count to 6 and the number of seconds to 1 or less. On larger networks, increase the count to decrease the chance of a false positive.
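The thresholding logic described above can be sketched in Python. This is not the Snort rule from Listing 20; it is a minimal illustration of counting OPTIONS requests per source inside a time window, with the function names and sample addresses being assumptions. It also shows how the "/29 means 6 hosts" figure can be derived with the standard ipaddress module.

```python
import ipaddress
from collections import defaultdict, deque


def usable_hosts(cidr):
    """Number of usable host addresses in a network, e.g. 6 for a /29."""
    return sum(1 for _ in ipaddress.ip_network(cidr).hosts())


def detect_scanners(events, count, seconds):
    """Flag sources that send `count` or more OPTIONS requests within `seconds`.

    `events` is an iterable of (timestamp, src_ip, dst_ip) tuples.
    """
    recent = defaultdict(deque)  # source IP -> timestamps inside the window
    flagged = set()
    for ts, src, _dst in sorted(events):
        window = recent[src]
        window.append(ts)
        while ts - window[0] > seconds:
            window.popleft()
        if len(window) >= count:
            flagged.add(src)
    return flagged


# Hypothetical traffic: one source sweeps all six hosts of a /29 in half a second.
network = "192.0.2.0/29"
events = [(0.1 * i, "203.0.113.5", f"192.0.2.{i + 1}") for i in range(6)]
events += [(0.2, "198.51.100.7", "192.0.2.1"), (0.9, "198.51.100.7", "192.0.2.1")]

hosts = usable_hosts(network)
print(hosts, detect_scanners(events, count=hosts, seconds=1))
```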
The rule on an excessive number of SIP 4xx responses attempts to catch the majority of attacks outlined in this article. What it effectively does is match responses which contain a client error. This may be a 404 Not Found response like the one given by an Asterisk box when running svwar to identify SIP extensions or users. It will also match a password cracking attempt on an Axon PBX, or an extension enumeration attack on a Brekeke PBX when using svwar with the OPTIONS method. Of course, it will not catch a scan for SIP devices on a network which does not have a lot of devices, simply because the number of responses would be low.
The ghost phone call can also be easily detected since it generates a large number of ringing messages. Of course the payload of this attack is audible, and therefore the benefits of adding this rule might not be immediately apparent since the attack makes itself so obvious. However, a Snort rule at this stage might be very useful during incident response, when trying to determine things such as the source of the attack. The rule should be modified depending on the network. For example, it does not make sense to deploy this Snort rule in a call center that takes 50 calls every minute.
Snort is not the only tool to monitor your VoIP infrastructure for attacks. In fact, Snort would very likely NOT detect any attacks passing through encrypted traffic. On the other hand, monitoring the logs on your IP PBX might be a good way of detecting some attacks destined for the SIP gateway. J. Oquendo posted a BASH script called astrap which monitors the Asterisk log entries for an excessive number of failed authentication attempts. This small tool will list the offender's IP address, the number of password failures, and the extensions that were targeted on the Asterisk system.
A host intrusion detection system such as OSSEC can be equally useful in detecting and automatically mitigating attacks. At the time of writing, OSSEC does not come preconfigured to support Asterisk log files, but this functionality can be easily added. Listing 21 includes a sample rule file for OSSEC to show how it can be configured to detect username enumeration and password attacks on an Asterisk system such as Trixbox. Listing 22 shows the changes required to enable this new Asterisk rule. We include a decoder entry so that OSSEC will be able to extract the attacker's IP address and then use that to automatically block the attack by adding the appropriate firewall rule.
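The log-based detection described above can be illustrated with a short Python sketch. It is neither astrap nor the OSSEC rules from the listings. The log line format is an assumption modeled on typical Asterisk chan_sip notices and should be checked against your own logs; the function names and sample lines are hypothetical.

```python
import re
from collections import defaultdict

# Assumed shape of an Asterisk failed-registration notice (verify against real logs).
FAILED = re.compile(
    r"Registration from '(?P<who>.*?)' failed for '(?P<ip>[0-9.]+)(?::\d+)?' - (?P<reason>.+)$"
)
EXTENSION = re.compile(r"sip:([^@>]+)@")


def summarize_failures(lines):
    """Count failed registrations per source IP and collect the targeted extensions."""
    summary = defaultdict(lambda: {"failures": 0, "extensions": set()})
    for line in lines:
        match = FAILED.search(line)
        if not match:
            continue
        entry = summary[match.group("ip")]
        entry["failures"] += 1
        ext = EXTENSION.search(match.group("who"))
        if ext:
            entry["extensions"].add(ext.group(1))
    return dict(summary)


def offenders(summary, max_failures):
    """Source IPs with more than `max_failures` failed attempts."""
    return sorted(ip for ip, entry in summary.items() if entry["failures"] > max_failures)


prefix = "[2011-10-09 12:00:01] NOTICE[1234] chan_sip.c: Registration from "
log_lines = [
    prefix + "'<sip:100@192.0.2.1>' failed for '203.0.113.5' - Wrong password",
    prefix + "'<sip:101@192.0.2.1>' failed for '203.0.113.5' - No matching peer found",
    prefix + "'<sip:102@192.0.2.1>' failed for '203.0.113.5:5060' - Wrong password",
    prefix + "'<sip:200@192.0.2.1>' failed for '198.51.100.7' - Wrong password",
    "[2011-10-09 12:00:02] VERBOSE[1234] logger.c: unrelated entry",
]

summary = summarize_failures(log_lines)
print(offenders(summary, max_failures=2))
```

The list of offending addresses is what a tool of this kind would hand to a firewall for blocking.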

References
http://www.ietf.org/rfc/rfc3261.txt – RFC 3261
http://www.iptel.org/sip/intro/purpose – Purpose of SIP
http://www.wormulon.net/ – smap
http://sipvicious.org/ – SIPVicious tool suite
http://tinyurl.com/rtjl8 – SIP peers external authentication in Asterisk/OpenPBX
http://www.hackingvoip.com/ – SIPSCAN
http://www.oxid.it – Cain and Abel
http://tinyurl.com/yph6jy – Interview with Robert Moore
http://tinyurl.com/56bwd – VLAN Security White Paper
http://www.snocer.org/Paper/sip-rules.zip – Snocer, snort rules
http://www.infiltrated.net/scripts/astrap – astrap
http://www.ossec.org/ – OSSEC
http://www.trixbox.org/ – Trixbox

Hat Tip - Sandro Gauci

Wednesday, October 5, 2011

SIP Peering KPI’s - How to Measure Answer Seize Ratio


Service providers have for many decades measured key performance indicators for their SS7 interconnects with long-distance or international operators or peering partners. Such measurements are defined in ITU-T Recommendation E.411 "International Network Management – Operational Guidance" and E.422 "Quality of Service for Outgoing International Calls" and include Answer Seize Ratio [ASR], Post Dial Delay [PDD] and Network Efficiency Ratio [NER].

Name      Description
-------   -----------------------------------------
Counted   Number of calls
ASR       Answered calls (percent)
SSB       Subscriber busy (percent)
CGC       Circuit Group Congestion (percent)
SEC       Switching Equipment Congestion (percent)
CFL       Call Failure (percent)
RSC       Reset Circuit Signal (percent)
UNN       Unallocated Number (percent)
ADI       Address Incomplete (percent)
CLF       Clear Forward (percent)
LOS       Line Out of Service (percent)
rt        Response time (average)
wt        Wait time with answer (average)
wna       Wait time with no answer (average)
ct        Call time (average)
ht        Hold time (average)
Minutes   Total call time in minutes

Table 1

Answer Seize Ratio [ASR] is used as a measure of network quality, although the measurement also includes user behavior. In other words, if a call is not answered, the network is not at fault, yet the uncompleted call still lowers the ASR and so suggests lower quality. However, for a given hour of the day, unanswered calls tend to represent about the same percentage from day to day, so this offset is normalized out. Carriers can therefore monitor the trend of ASR and treat it as a relative measurement.
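As an illustration of treating ASR as a relative measurement, the following minimal Python sketch compares each hour's ASR against a baseline for the same hour on earlier days. The function names, the sample figures and the 5-point tolerance are assumptions made for illustration, not values taken from any ITU recommendation.

```python
def asr(answered, seizures):
    """Answer Seize Ratio as a percentage; None when there were no seizures."""
    if seizures == 0:
        return None
    return 100.0 * answered / seizures


def hours_below_baseline(today, baseline, tolerance=5.0):
    """Return the hours whose ASR fell more than `tolerance` points below baseline.

    `today` maps hour -> (answered, seizures); `baseline` maps hour -> ASR percent.
    """
    flagged = []
    for hour, (answered, seizures) in sorted(today.items()):
        value = asr(answered, seizures)
        if value is not None and hour in baseline and baseline[hour] - value > tolerance:
            flagged.append(hour)
    return flagged


# Hypothetical figures: hour 9 is in line with its usual ASR, hour 10 is not.
baseline = {9: 62.0, 10: 65.0}
today = {9: (305, 500), 10: (270, 500)}
print(hours_below_baseline(today, baseline))
```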

Network Efficiency Ratio was designed to eliminate user behavior as a factor and better represent pure network performance.
Network Efficiency Ratio [NER] is defined as:

        (User Answers or Normal call clearing   -   Cause code: 16
       + User Busy                              -   Cause code: 17
       + Ring No Answer                         -   Cause code: 18 & 19
       + Terminal Rejects)                      -   Cause code: 21
NER = ------------------------------------------------------------- x 100
        (Total # of Call Attempts, i.e. IAMs)
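The NER formula above can be sketched in Python as follows. The sketch assumes each call attempt is recorded with exactly one release cause code; the function name and the sample mix of cause codes are hypothetical.

```python
from collections import Counter

# Cause codes counted as "network did its job" in the NER formula above.
NER_CAUSE_CODES = {16, 17, 18, 19, 21}


def network_efficiency_ratio(cause_codes):
    """NER as a percentage, given one release cause code per call attempt."""
    counts = Counter(cause_codes)
    total = sum(counts.values())
    if total == 0:
        return None
    effective = sum(n for code, n in counts.items() if code in NER_CAUSE_CODES)
    return 100.0 * effective / total


# Hypothetical hour of traffic: 100 attempts, 18 of them ending in network causes
# (34 and 41 are used here as examples of congestion or failure causes).
codes = [16] * 60 + [17] * 10 + [18] * 5 + [19] * 5 + [21] * 2 + [34] * 10 + [41] * 8
print(network_efficiency_ratio(codes))
```

In this example only 60 of the 100 attempts were answered, so the ASR would be 60%, while the NER is higher because busy, no-answer and rejected calls are not held against the network.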

SIP is a more flexible protocol and has wider uses than simple call control. Use cases therefore differ widely, and systems such as voicemail and call forwarding skew the expected behavior for answered calls.

Furthermore, SIP has a wider range of response messages, which can be used to specifically indicate certain types of failure caused by servers, network devices, or the movement, absence or other behavior of users. Accordingly, the IETF has defined KPI's equivalent to Answer Seize Ratio and Network Efficiency Ratio. These are described in the SIP End-to-End Performance Metrics draft, ietf-pmol-sip-perf-metrics-04. For example, the equivalent of ASR is Session Establishment Ratio (SER).

Session Establishment Ratio (SER) is defined as follows, to quote the IETF:

   “This metric is used to detect the ability of a terminating UA or
   downstream proxy to successfully establish sessions per new session
   INVITE requests.  SER is defined as the number of new session INVITE
   requests resulting in a 200 OK response, to the total number of
   attempted INVITE requests less INVITE requests resulting in a 3XX
   response.  This metric is similar to Answer Seizure Ratio (ASR)”


The SER is calculated using the following formula:


                 # of INVITE Requests w/ associated 200 OK
  SER = --------------------------------------------------------- x 100
     (Total # of INVITE Requests)-(# of INVITE Requests w/ 3XX Response)

Here is the message flow which defines the basic SER. If the session INVITE request results in a 3XX redirection response, such as a 302 redirect, that request should be subtracted from the denominator.


                           UA1                 UA2
                            |                   |
                            |INVITE             |
               +----------->|------------------>|
               |            |                180|
               |            |<------------------|
      Session Established   |                   |
               |            |                   |
               |            |                200|
               +----------->|<------------------|
                            |                   |


In SS7, the fate of the call is determined by the RELEASE message which ends the call. The fate of a SIP call is determined by the response to each request for each transaction within the SIP call. For example, as shown above, session establishment is determined by a successful outcome of the session establishment phase, i.e. a 200 OK response to the INVITE.
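The SER formula above can be sketched in Python as follows. The sketch assumes you already have one final SIP status code per new-session INVITE; the function name and the sample traffic mix are hypothetical.

```python
def session_establishment_ratio(final_codes):
    """SER as a percentage, given one final SIP status code per new-session INVITE."""
    total = len(final_codes)
    redirected = sum(1 for code in final_codes if 300 <= code <= 399)
    established = sum(1 for code in final_codes if code == 200)
    denominator = total - redirected  # 3XX responses are removed from the attempts
    if denominator == 0:
        return None
    return 100.0 * established / denominator


# Hypothetical sample: 70 answered, 10 redirected, 10 busy, 10 service unavailable.
codes = [200] * 70 + [302] * 10 + [486] * 10 + [503] * 10
ser = session_establishment_ratio(codes)
print(round(ser, 2))
```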

The SIP Peering KPI’s RFC 6076 also defines Session Establishment Effectiveness Ratio (SEER) which is similar to Network Efficiency Ratio [NER] in the SS7 ISUP circuit switched world.

This metric is complementary to SER, but excludes the effects of the terminating UAS or endpoint, i.e. it excludes user behavior from the metric and therefore more closely reflects the performance of the network. SEER is defined as the ratio of the number of INVITE requests resulting in a 200 OK response, plus INVITE requests resulting in a 480 (Temporarily Unavailable), 486 (Busy Here, i.e. that endpoint is busy), or 600 (Busy Everywhere, i.e. the callee is busy at every endpoint) response, to the total number of INVITE attempts less those resulting in a 3XX, 401, 402, or 407 response.

In order to simplify the formula, the following variable ‘a’ is used to summarize multiple SIP responses:

   a = 3XX, 401, 402, and 407

The SEER is calculated using the following formula:

             # of INVITE Requests w/ associated 200 OK, 480, 486, or 600
SEER = -------------------------------------------------------- x 100
            (Total # of INVITE Requests)-(# of INVITE Requests w/ 'a' Response)


 SIP response codes
  • 2xx—Successful Responses
    • e.g. 200 OK
  • 4xx—Client Failure Responses
    • 401 Unauthorized (Used only by registrars or user agents. Proxies should use proxy authorization 407)
    • 402 Payment Required (Reserved for future use)
    • 480 Temporarily Unavailable
    • 486 Busy Here
  • 6xx—Global Failure Responses
    • 600 Busy Everywhere
    • 603 Decline
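The SEER formula above can be sketched in Python in the same way. The numerator codes and the 'a' set are taken from the text; the function name and the sample traffic are hypothetical.

```python
# Responses that count as "the network delivered the request" in the SEER numerator.
EFFECTIVE_CODES = {200, 480, 486, 600}
# Variable 'a' from the text: responses removed from the denominator.
CHALLENGE_CODES = {401, 402, 407}


def session_establishment_effectiveness_ratio(final_codes):
    """SEER as a percentage, given one final SIP status code per INVITE."""
    excluded = sum(
        1 for code in final_codes if 300 <= code <= 399 or code in CHALLENGE_CODES
    )
    effective = sum(1 for code in final_codes if code in EFFECTIVE_CODES)
    denominator = len(final_codes) - excluded
    if denominator == 0:
        return None
    return 100.0 * effective / denominator


# Hypothetical sample: 70 answered, 10 redirected, 5 challenged,
# 5 busy, 5 temporarily unavailable, 5 service unavailable.
codes = [200] * 70 + [302] * 10 + [407] * 5 + [486] * 5 + [480] * 5 + [503] * 5
seer = session_establishment_effectiveness_ratio(codes)
print(round(seer, 2))
```

With this sample, the busy and unavailable endpoints do not lower the result, so only the five 503 responses count against the network.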


Another very useful measurement defined by the IETF here is Session Defects Ratio (SDR), also graphed by Palladion by checking a box. ‘503’ SIP Response messages commonly indicate a route to one of your peering partners is failing due to congestion of the Gateway or SoftSwitch.

Session Defects Ratio (SDR) is the percentage of call attempts receiving the following responses, in relation to total call attempts or INVITES:

o    500 Server Internal Error
o    503 Service Unavailable
o    504 Server Timeout
The SDR is calculated using the following formula:


  

                      # of INVITE Requests w/ associated 500, 503, or 504
     SDR = ----------------------------------------------------- x 100
                             Total # of INVITE Requests
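Because SDR is most useful for spotting a failing route to a peering partner, the following minimal Python sketch applies the formula above per route. The route labels, function name and sample counts are hypothetical.

```python
from collections import defaultdict

# Server-side failures counted by the SDR formula above.
DEFECT_CODES = {500, 503, 504}


def sdr_by_route(attempts):
    """Map each route to its SDR percentage.

    `attempts` is an iterable of (route, final SIP status code) pairs.
    """
    totals = defaultdict(int)
    defects = defaultdict(int)
    for route, code in attempts:
        totals[route] += 1
        if code in DEFECT_CODES:
            defects[route] += 1
    return {route: 100.0 * defects[route] / totals[route] for route in totals}


# Hypothetical sample: peer-b is returning 503 for 40% of its INVITEs.
attempts = (
    [("peer-a", 200)] * 95 + [("peer-a", 503)] * 5
    + [("peer-b", 200)] * 60 + [("peer-b", 503)] * 40
)
print(sdr_by_route(attempts))
```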

In addition, some carriers like to plot Post Dial Delay [PDD] and to generate an alert when this measurement exceeds a certain time threshold. PDD is defined as the time interval between transmission of the INVITE and reception of the ‘180 Ringing’ response message. The RFC instead defines Successful Session Setup [SRD], the SIP equivalent of Post-Selection Delay (defined in E.721). A rising PDD is an early indication that congestion is about to occur on a given route, which makes it a very useful KPI. Palladion can be configured to send an SNMP trap or other notification of this imminent congestion to the Network Operations Center [NOC] so that preemptive action can be taken.
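The PDD measurement and threshold alert described above can be sketched in Python as follows. This is not how Palladion implements it; the event tuple layout, function names, sample timestamps and the 4-second threshold are assumptions made for illustration.

```python
def post_dial_delays(events):
    """Compute PDD per call from time-ordered signaling events.

    `events` holds (timestamp_seconds, call_id, kind) with kind "INVITE" or "180".
    """
    invite_at = {}
    delays = {}
    for ts, call_id, kind in events:
        if kind == "INVITE":
            # Keep the first transmission; later copies are retransmissions.
            invite_at.setdefault(call_id, ts)
        elif kind == "180" and call_id in invite_at and call_id not in delays:
            delays[call_id] = ts - invite_at[call_id]
    return delays


def calls_over_threshold(delays, threshold):
    """Call IDs whose PDD exceeded `threshold` seconds."""
    return sorted(call_id for call_id, delay in delays.items() if delay > threshold)


# Hypothetical capture: call "a" rings quickly, call "b" takes 6.5 seconds.
events = [
    (0.0, "a", "INVITE"),
    (0.5, "a", "INVITE"),
    (1.2, "a", "180"),
    (2.0, "b", "INVITE"),
    (8.5, "b", "180"),
]
delays = post_dial_delays(events)
print(delays, calls_over_threshold(delays, threshold=4.0))
```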

Here is a table listing all the new RFC 6076 KPI’s with their SS7 or circuit switched equivalent:


Each entry lists the RFC 6076 SIP peering KPI, its SIP definition, and the ISUP equivalent and ISUP definition where one exists.

Registration Request Delay (RRD)
    SIP definition:   Time of final response - time of REG attempt
    ISUP equivalent:  N/A (see note * below)

Ineffective Registration Attempts (IRA)
    SIP definition:   # IRA / total REG attempts
    ISUP equivalent:  N/A

Session Request Delay (SRD)
    SIP definition:   Time of response - time of INVITE

Successful Session Setup [SRD]
    SIP definition:   Time of response - time of INVITE, e.g. INVITE to 180
    ISUP equivalent:  Post-Selection Delay (defined in E.721)
    ISUP definition:  IAM to ACM (or ALERTING)

Failed Session Setup [SRD]
    SIP definition:   INVITE to a response indicating failure, e.g. a 4XX (excluding the 401, 402, and 407 non-failure challenge response codes), 5XX, or 6XX message
    ISUP equivalent:  N/A
    ISUP definition:  IAM to REL with a cause code indicating a failure

Successful Session Disconnect Delay [SDD]
    SIP definition:   BYE to 2XX Ack
    ISUP equivalent:  Time to clear a good call or CIC
    ISUP definition:  REL to RLC

Failed Session Disconnect Delay [SDD]
    SIP definition:   BYE to Timer F expiry
    ISUP definition:  Missing RLC, i.e. expiry of ISUP T1 timer

Session Duration Time [SDT]
    SIP definition:   Time of BYE or timeout - time of 200 OK
    ISUP equivalent:  Average Call Hold Time (ACHT)
    ISUP definition:  Time of REL - time of ANS

Successful Session Duration Time
    SIP definition:   200 OK response to an INVITE, to BYE
    ISUP equivalent:  See note ** below

Failed Session Duration Time [SDT]
    SIP definition:   200 OK response to an INVITE, to the resulting Timer F expiration
    ISUP equivalent:  "Awaiting Answer" timer
    ISUP definition:  SS7 ISUP timer T9 (no response to ACM)

Session Establishment Ratio (SER)
    SIP definition:
                     # of "good call" INVITEs
        SER = ----------------------------------------- x 100
               Total INVITEs - INVITEs w/ 3XX Response
    ISUP equivalent:  Answer Seize Ratio (ASR)
    ISUP definition:  Calls which connect, calls which fail

Session Establishment Effectiveness Ratio (SEER)
    SIP definition:
                # of INVITE Requests w/ associated 200 OK, 480, 486, or 600
        SEER = --------------------------------------------------------------------- x 100
                (Total # of INVITE Requests)-(# of INVITE Requests w/ 'a' Response)
    ISUP equivalent:  Network Efficiency Ratio (NER), ITU E.411
    ISUP definition:
               (Answers (cc 16) + User Busy (cc 17)
              + Ring No Answer (cc 18 & 19) + Terminal Rejects (cc 21))
        NER = ----------------------------------------------------------
                              Total call attempts

Ineffective Session Attempts (ISA)
    SIP definition:   INVITEs resulting in:
                      o  408 Request Timeout
                      o  500 Server Internal Error
                      o  503 Service Unavailable
                      o  504 Server Timeout

                      # of ISA x 100
                ---------------------------
                Total # of Session Requests
    ISUP equivalent:  Ineffective Machine Attempts (IMA) in telephony applications of SIP; adopted from Telcordia GR-512-CORE [GR-512]

Session Completion Ratio (SCR)
    SIP definition:   (a Session Completion is any SIP dialog that receives a valid response)

                # of Successfully Completed Sessions x 100
                ------------------------------------------
                       Total # of Session Requests
    ISUP equivalent:  Call Completion Ratio (CCR)

Notes:
· * REGISTRATIONs are rare across SIP peering points, as these interconnects are typically between two trusted environments. In addition, SBC's are used to protect the sanctity of each network. Similarly, an interconnect between two SS7 networks is a trusted and secure interface, so no equivalent of REGISTRATIONs exists in SS7, although the MTP3 protocol is used to establish routing between Signaling Point Codes, and ISUP control messages are used to set up voice trunks.
· ** Monitoring for short-duration successful calls has not been as important in a circuit switched network as in a VoIP network, because speech quality is not so much of an issue in circuit switched networks. In a VoIP network, short calls are important to monitor because, if speech quality is bad enough, callers hang up immediately and, hopefully, call back.

References

· RFC 6076: Basic Telephony SIP End-to-End Performance Metrics
· ITU-T Recommendation E.411: International network management - Operational guidance
· ITU-T Recommendation E.422: Observations on international outgoing telephone calls for quality of service
· Telcordia GR-512-CORE: LSSGR: Reliability, Section 12