POLQA (Perceptual Objective Listening Quality Assessment), also known as ITU-T Rec. P.863, is an ITU-T standard that covers a model to predict speech quality by means of digital speech signal analysis.
The predictions of such objective measures should come as close as possible to the quality scores obtained in subjective listening tests. Usually, a Mean Opinion Score (MOS) is predicted. POLQA uses real speech as a test stimulus for assessing telephony networks.
POLQA is the successor of PESQ (ITU-T Rec. P.862). POLQA avoids weaknesses of the P.862 model and extends it to handle higher-bandwidth audio signals. Further improvements target the handling of time-scaled signals and signals with many delay variations. Like P.862, POLQA supports measurements in the common telephony band (300–3400 Hz), but it also has a second operational mode for assessing HD Voice in wideband and super-wideband speech signals (50–14000 Hz). POLQA also targets the assessment of speech signals recorded acoustically by an artificial head with mouth and ear simulators.
The POLQA activities started in ITU-T in early 2006 under the working title POLQA. In mid-2009 a competition was started to evaluate several candidate models. In May 2010 ITU-T selected candidate models from three companies to form the future Recommendation P.863: OPTICOM, SwissQual (a Rohde & Schwarz company), and TNO (Netherlands Organisation for Applied Scientific Research). The three companies were asked to merge their approaches into a single standardized model. The result is now standardized as POLQA / P.863.
ITU-T’s family of full-reference objective voice quality measurements started in 1997 with P.861 (PSQM), which was superseded by P.862 (PESQ) in 2001. P.862 was later complemented by the recommendations P.862.1 (mapping of PESQ scores to a MOS scale), P.862.2 (wideband measurements) and P.862.3 (application guide). P.863 (POLQA) has been in force since 2011. Two additional implementer’s guides for P.863 were consented by ITU-T Study Group 12 in November 2011. In addition to the full-reference methods listed above, ITU-T’s objective voice quality measurement standards also include P.563 (a no-reference algorithm).
POLQA, like P.862 (PESQ), is a Full Reference (FR) algorithm that rates a degraded or processed speech signal in relation to the original signal. It compares each sample of the reference signal (talker side) to the corresponding sample of the degraded signal (listener side). Perceptual differences between the two signals are scored as differences. The psycho-acoustic model is based on models of human perception similar to those used in MP3 or AAC. In essence, the signals are analysed in the frequency domain (in critical bands) after masking functions have been applied. Unmasked differences between the two signal representations are counted as distortions. Finally, the accumulated distortions in the speech file are mapped onto a quality scale from 1 to 5, as is usual for MOS tests. FR measurements deliver the highest accuracy and repeatability but can only be applied for dedicated tests in live networks (e.g. drive test tools for mobile network benchmarks).
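The general shape of a full-reference comparison can be illustrated with a toy sketch in Python. This is not the P.863 algorithm, which is far more elaborate. The function names, the equal-width bands (standing in for critical bands), the fixed 3 dB masking threshold and the exponential mapping to the 1 to 5 scale are all assumptions chosen only for illustration, and the two signals are assumed to be already time-aligned.

```python
import cmath
import math


def band_energies(frame, n_bands):
    """Naive DFT power spectrum of one frame, pooled into equal-width bands.

    Equal-width bands are a simplification; real perceptual models use
    critical bands with non-uniform width.
    """
    n = len(frame)
    half = n // 2
    power = []
    for k in range(half):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power.append(abs(acc) ** 2 / n)
    width = half // n_bands
    return [sum(power[b * width:(b + 1) * width]) for b in range(n_bands)]


def toy_full_reference_score(reference, degraded, frame_len=64, n_bands=8, mask_db=3.0):
    """Toy full-reference score on a 1..5 scale (illustrative, NOT P.863)."""
    if len(reference) != len(degraded):
        raise ValueError("signals must be time-aligned and of equal length")
    floor = 1e-6  # hypothetical audibility floor; also avoids log of zero
    total = 0.0
    frames = 0
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        ref_bands = band_energies(reference[start:start + frame_len], n_bands)
        deg_bands = band_energies(degraded[start:start + frame_len], n_bands)
        for r, d in zip(ref_bands, deg_bands):
            diff_db = abs(10.0 * math.log10((d + floor) / (r + floor)))
            # Differences below the threshold are treated as masked (inaudible).
            total += max(0.0, diff_db - mask_db)
        frames += 1
    if frames == 0:
        raise ValueError("signals are shorter than one frame")
    mean_distortion = total / (frames * n_bands)
    # Arbitrary monotonic mapping: no distortion gives 5, heavy distortion tends to 1.
    return 1.0 + 4.0 * math.exp(-mean_distortion / 10.0)


if __name__ == "__main__":
    import random

    rng = random.Random(0)
    reference = [math.sin(2 * math.pi * 4 * t / 64) for t in range(256)]
    noisy = [s + 0.3 * rng.uniform(-1.0, 1.0) for s in reference]
    print(toy_full_reference_score(reference, reference))  # identical signals give 5.0
    print(toy_full_reference_score(reference, noisy))
```

The sketch mirrors only the outline described above: both signals are transformed to the frequency domain, differences below a threshold are discarded as masked, and the remaining differences are accumulated and mapped to a score. A real implementation would also need time alignment, level equalisation and a calibrated perceptual model.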