Mean Opinion Score (MOS) is a scale from 1 to 5 indicating speech quality: 1 is bad and 5 is excellent. MOS test sessions comprise 15 to 25 people who listen to speech files, some of good quality and some of poor quality with impairments, and score them subjectively. This subjective test process is specified in ITU-T P.800. In more than 16 years of such tests, no statistically significant number of participants has ever scored a speech recording as excellent (5.0). The highest score typically obtained in any test was 4.5.
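As a rough illustration of how a MOS comes out of a listening panel, the sketch below averages the individual 1 to 5 ratings for one recording and attaches an approximate 95% confidence interval. The function name, the panel of 16 ratings and the normal-approximation interval are assumptions made for this example; they are not taken from ITU-T P.800.

```python
from math import sqrt
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Return (MOS, 95% CI half-width) for a list of ratings on the 1 to 5 scale."""
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie on the 1 to 5 scale")
    mos = mean(ratings)
    # Normal approximation (assumption); a t-distribution gives a slightly
    # wider interval for panels of this size.
    half_width = 1.96 * stdev(ratings) / sqrt(len(ratings))
    return mos, half_width


# Hypothetical panel of 16 listeners rating a single recording
ratings = [4, 5, 4, 4, 3, 5, 4, 4, 5, 4, 4, 3, 4, 5, 4, 4]
mos, ci = mean_opinion_score(ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

Even with several listeners awarding a 5, the average for this made-up panel stays well below 5.0, which is consistent with the observation that panels rarely, if ever, produce a top score.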
High Definition or Wideband Telephony speech uses the new POLQA speech quality metric for objective prediction of MOS. The old PESQ algorithm has been used for narrowband telephony since it was approved in 2000. It is desirable to use the same scale so that laboratories can compare new results for wideband telephony with their old PESQ database. However, the question of human expectation comes into play, because all these objective measurements performed by computers must correlate with, or predict, subjective experience. If you watch a video on your smartphone, you might consider the picture quality good. Your expectations are set by the small screen and the convenience of the video being played on a handheld device. If you were given the same video on your brand-new, expensive high-definition 1080p TV, you would be very disappointed, even if the pixel resolution had been scaled to the 62-inch screen. Your expectation of quality is tempered by the format in which you are viewing it.
The same applies to speech and audio. If you were to participate in a MOS test and were invited into a studio with high-fidelity speakers and orchestral classical music playing, and were then asked to rate the quality of the High Definition speech you were about to hear, your expectations would be set high and you would be more critical. You would score the audio lower than if you had been asked to rate the speech quality of your most recent cellular phone call.
POLQA offers two scales: the narrowband scale and the super wideband scale. Super wideband telephony reaches 14 kHz analog audio frequency. The narrowband focus scale maps directly onto the old PESQ scale and exploits the higher scores that test participants do not give in narrowband tests.
• NB: Maximum MOS value 4.25
• WB: Maximum MOS value 4.5
• SWB: Maximum MOS value 4.75
So a score of 4.5 on the narrowband POLQA scale is, experimentally, the best value you will ever obtain with wideband telephony equipment. You could conceivably measure a MOS value of 4.75 if you were measuring super wideband equipment.
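To make these ceilings concrete, here is a minimal sketch that stores the three maximum values from the list above and reports how much headroom a measured score has. It assumes the values are read as per-bandwidth maxima; the dictionary, the function name and the check against a floor of 1.0 are illustrative and not part of any POLQA tool.

```python
# Ceilings copied from the list above (bandwidth -> maximum MOS)
MAX_MOS = {"NB": 4.25, "WB": 4.5, "SWB": 4.75}


def headroom(score, bandwidth):
    """Return how far a measured score sits below the ceiling for its bandwidth."""
    ceiling = MAX_MOS[bandwidth]
    if not 1.0 <= score <= ceiling:
        raise ValueError(f"score {score} is outside 1.0 to {ceiling} for {bandwidth}")
    return ceiling - score


# A narrowband measurement of 4.21 is almost at the narrowband ceiling
print(f"{headroom(4.21, 'NB'):.2f}")
```

Read this way, a score should be judged against the ceiling of the bandwidth being measured, not against the nominal top of the 1 to 5 scale.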
In future years, the industry will migrate exclusively to the super wideband POLQA scale, once users come to expect high-definition or hi-fi quality from communications audio.
The picture shows the iLBC codec measuring 4.21 on the narrowband focus scale.
For more information on making PESQ and POLQA measurements, be sure to contact only renowned and well-respected test vendors, because the science of speech quality measurement requires expertise and experience in many different areas: audio, analog electronics and computing. It is easy to make a measurement, but care is required to ensure that the measurement is accurate and correlates with human subjective experience.
The most trusted vendor for speech quality metrics is Malden Electronics, available in USA through Teraquant Corporation – www.teraquant.com