Why I bought a $500 phone and my friend’s voice still sounds like a mouse’s
Have you ever compared the quality of a Skype-to-Skype conversation with a call on your traditional phone?
If you have not, I suggest you give it a try. You will discover that a Skype-to-Skype conversation has significantly more clarity than a call on your traditional phone. A little research will show you that phone call quality has not changed much since the 1930s. You will probably have difficulty distinguishing between sounds such as “s”, “f”, “c”, “e” and “d”. We are so used to those inefficiencies that when somebody asks us “What’s your name?”, we usually answer along the lines of “‘S’ like ‘sandwich’, ‘t’ like ‘tiny’, ‘e’ like ‘elephant’” and so on (usually in some kind of phone transaction). We don’t even notice it. It is pure habit.
What seems to be a bigger problem is the process of “guessing”, which some scientists refer to as “brain fatigue”. “Are we ‘c’eiling this week?” or “Are we ‘f’ailing this week?”: the difference between “ceiling” and “failing” is a decision outsourced to our subconscious mind. If you spend a few hours a day on the phone, imagine how much work your subconscious mind needs to do.
So I guess many of you will ask, “Why, with so much technology nowadays, do we end up with such poor call quality?” To answer that question I need to go a little bit into the science. I will try to keep it simple and use it just as a comparison. At the beginning of the 20th century the Public Switched Telephone Network (PSTN) was created. It was designed to carry frequencies from about 300 Hz up to roughly 3 kHz. (Just for comparison: the human ear can hear sound in the range 0.02 kHz to 20 kHz; AM radio covers frequencies up to 5 kHz; television and FM radio cover up to 15 kHz. See the audio examples below.) To hear sounds such as “s” and “f” clearly we need frequencies above 3.5 kHz. The 3 kHz limit was imposed not by the capacity of the wire or the network, but by the need to control the cost of the end phone device. I think the idea was to eventually raise this limit as phone devices became more and more sophisticated. Unfortunately this never happened. The 3 kHz limit became globally accepted as the PSTN standard. It was even carried over into digital telephony with G.711, a codec first standardized in 1972 that is still the most common one in the Voice over IP (VoIP) world.
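To put rough numbers on the comparison above, here is a small Python sketch. The frequency limits are the ones quoted in this post; the G.711 sample rate (8,000 samples per second) and sample size (8 bits) come from the G.711 specification rather than from this post, and the function names are made up purely for illustration.

```python
# Upper frequency limits quoted in the post, in Hz.
UPPER_LIMIT_HZ = {
    "PSTN / G.711 call": 3_000,
    "AM radio": 5_000,
    "FM radio / television": 15_000,
    "Human hearing": 20_000,
}

# The post's figure: "s" and "f" need content above about 3.5 kHz.
SIBILANT_HZ = 3_500

# Assumption (from the G.711 specification, not from the post):
# 8,000 samples per second, 8 bits per sample.
G711_SAMPLE_RATE_HZ = 8_000
G711_BITS_PER_SAMPLE = 8


def carries_sibilants(upper_limit_hz: float) -> bool:
    """True if the channel passes frequencies above the sibilant threshold."""
    return upper_limit_hz > SIBILANT_HZ


def nyquist_limit_hz(sample_rate_hz: float) -> float:
    """Highest frequency a sampled signal can represent: half the sample rate."""
    return sample_rate_hz / 2


def bitrate_bps(sample_rate_hz: int, bits_per_sample: int) -> int:
    """Uncompressed bitrate of a single audio channel, in bits per second."""
    return sample_rate_hz * bits_per_sample


if __name__ == "__main__":
    for name, limit in UPPER_LIMIT_HZ.items():
        verdict = "yes" if carries_sibilants(limit) else "no"
        print(f"{name:<24} up to {limit / 1000:>4.1f} kHz   clear 's'/'f': {verdict}")

    print("G.711 sampling ceiling:", nyquist_limit_hz(G711_SAMPLE_RATE_HZ), "Hz")
    print("G.711 bitrate:", bitrate_bps(G711_SAMPLE_RATE_HZ, G711_BITS_PER_SAMPLE), "bit/s")
```

The point of the sketch is that the phone channel is the only one in the list that stops below the 3.5 kHz mark. Even the sampling rate assumed for G.711 tops out at 4 kHz, so there is very little room above that mark no matter how good the handset is.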
Decades later, that initial standardization of digital telephony has not changed. There are a few reasons for that and I’m sure you know them: regulation, patents, and research and development (R&D). It is very costly to create a new codec. When you create it, you want to protect it with a patent. Patents create obstacles when the codec has to be implemented on a large scale. And regulators are not going to recommend a codec unless it is already widely deployed.
So when I buy the latest Android phone or iPhone, should I still expect to guess whether we are “ceiling” or “failing” during a call? Probably yes, unless the conversation is between two iPhones on the same network.
Now you’re expecting a pitch. I am not going to pitch high, since it would be irrelevant. Although many VoIP providers pride themselves on “HD Voice”, the probability of actually achieving it with a VoIP device is very low. The “HD” ecosystem is not there yet. What we’re seeing at Alliance Phones is a positive move toward “open source” digital codecs. On a regular basis we test different scenarios with our carriers. The beauty of today’s technology is that we can quickly roll out any updates to our new and existing PBX systems and the client phone systems attached to them.