Mind game smooths streaming audio

By Kimberly Patch, Technology Research News

Listening to streaming audio over the Internet can be an uneven experience because the Internet's bandwidth, which determines how much information can flow at any given moment, is variable. When the lines get crowded, the bandwidth available to a given user can drop below the level that formats like MP3 need in order to stream, or transmit information in real time, and the signal cuts out.

Researchers from the University of Washington are exploiting a quirk of human hearing in an attempt to keep the sound streaming when bandwidth gets pinched. The technology may also eventually be used to fingerprint sound files, said Les Atlas, a professor of electrical engineering at the University of Washington.

The human auditory system prioritizes sounds according to their duration, putting more weight on the longer, slower aspects of a sound. The researchers' sound algorithm, termed fine-grained scalable audio encoding, also gives more weight to these longer, slower portions of a sound, and keeps them going when the bandwidth gets crowded.

The algorithm also encodes sounds in a new way, allowing for smaller files in the first place, Atlas said.

Sounds consist of different frequencies, or tones, and our inner ears, as well as sound encoding techniques like MP3, sort sounds into frequencies, said Atlas. However, "breaking complex sounds like music up into frequencies is not the whole story," he said.

Another component of sound is modulation frequency, which is how the frequencies change as the sound changes from one note to another, said Atlas. "It's how these frequencies change over time which conveys the message in the music. It's... how the violin changes from note to note combined with the detailed changes of each note," he said.
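The distinction between an audio frequency and a modulation frequency can be illustrated with a toy signal. The sketch below is not the researchers' method; it simply assumes a 440-hertz tone whose loudness wobbles four times a second, and all of the names and parameter values are made up for illustration. It measures the loudness envelope frame by frame and then finds the rate at which that envelope changes.

```python
import cmath
import math

def frame_envelope(signal, frame_len):
    """RMS level of each consecutive frame: a coarse loudness envelope."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    return [math.sqrt(sum(s * s for s in f) / frame_len) for f in frames]

def dominant_modulation_hz(envelope, envelope_rate_hz):
    """Frequency of the strongest non-constant component of the envelope."""
    n = len(envelope)
    mean = sum(envelope) / n
    centered = [e - mean for e in envelope]
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2 + 1):
        # One bin of a plain discrete Fourier transform of the envelope.
        coeff = sum(c * cmath.exp(-2j * math.pi * k * i / n)
                    for i, c in enumerate(centered))
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    return best_bin * envelope_rate_hz / n

# All values below are assumptions chosen for the illustration.
SAMPLE_RATE = 8000   # audio samples per second
FRAME_LEN = 100      # 12.5 ms frames, so 80 envelope values per second
TONE_HZ = 440.0      # the audio frequency (the note itself)
WOBBLE_HZ = 4.0      # the modulation frequency (how fast the loudness changes)

tone = [(1.0 + 0.5 * math.cos(2 * math.pi * WOBBLE_HZ * t / SAMPLE_RATE))
        * math.sin(2 * math.pi * TONE_HZ * t / SAMPLE_RATE)
        for t in range(SAMPLE_RATE)]

envelope = frame_envelope(tone, FRAME_LEN)
modulation_hz = dominant_modulation_hz(envelope, SAMPLE_RATE / FRAME_LEN)
print(modulation_hz)
```

A frequency analysis of the raw samples would report the 440-hertz note; the analysis of the envelope reports the much slower wobble, which is the kind of information Atlas describes as carrying the message in the music.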

The researchers focused on representing these modulation frequencies efficiently, which allowed them to compress a given sound into a smaller file or, conversely, to increase the sound quality at a given file size.

At the same time, they gave the slow modulations, which matter more to human hearing, a higher transmission priority than the fast ones. This scalable representation is what allows the music file to continue streaming, albeit at a lower quality, when bandwidth gets squeezed.
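The effect of sending slow modulations first can be sketched with a toy loudness envelope. The code below is only an illustration under stated assumptions, not the researchers' codec: the envelope is invented, a plain discrete Fourier transform stands in for whatever transform the codec uses, and the function names are hypothetical. It keeps only the components that change slower than a cutoff and measures how far the rebuilt envelope is from the original.

```python
import cmath
import math

def dft(x):
    """Plain discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
            for k in range(n)]

def idft_real(coeffs):
    """Inverse transform, keeping only the real part of each sample."""
    n = len(coeffs)
    return [(sum(coeffs[k] * cmath.exp(2j * math.pi * k * i / n)
                 for k in range(n)) / n).real
            for i in range(n)]

def keep_slow_modulations(coeffs, max_rate_bin):
    """Zero every component that changes faster than max_rate_bin."""
    n = len(coeffs)
    # Bins k and n - k describe the same rate of change, so compare min(k, n - k).
    return [c if min(k, n - k) <= max_rate_bin else 0
            for k, c in enumerate(coeffs)]

def rms_error(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Invented envelope: a strong slow change (bin 2) plus a weak fast one (bin 15).
N = 64
envelope = [1.0
            + 0.6 * math.cos(2 * math.pi * 2 * i / N)
            + 0.2 * math.cos(2 * math.pi * 15 * i / N)
            for i in range(N)]
coeffs = dft(envelope)

errors = {}
for max_rate_bin in (0, 2, 15):
    approx = idft_real(keep_slow_modulations(coeffs, max_rate_bin))
    errors[max_rate_bin] = rms_error(envelope, approx)
    print(max_rate_bin, round(errors[max_rate_bin], 3))
```

In this toy case most of the error disappears once the slow component arrives, and the fast component only refines the result, which mirrors the idea of degrading gracefully when fewer bits get through.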

"This priority order is somewhat analogous to progressive transmission of images, where a blurry image is sent first and more detail follows later. However, sound is different. Once you hear the sound it's gone. So by progressive transmission of sound we... mean that the sound quality is so-so for the first second or so and then gets better. It'll always be doing the best it can with whatever bandwidth is available," said Atlas.

The scheme is not radically different from other data compression schemes, which all aim to reduce the amount of space, or bits, a given amount of information takes up in order to send it more quickly and store it more efficiently, said Atlas. "It's not like we designed a totally new audio compression approach. It's more like we discovered a key new internal function which... affects other more conventional functions, usually in some bit-reducing ways," he said.

The sound compression technique scales on the fly, making it practical for many types of users who may have different bandwidths at their disposal, said Atlas. "Only one stream for all possible data rates needs to be sent by the broadcaster," he said.
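One way to picture a single stream serving every data rate is an encoded frame whose bytes are ordered from most to least important, so that any prefix is still usable. The sketch below assumes that layout, a one-second frame, and a 64-kilobit-per-second full rate; the article does not describe the actual bitstream format, and the function names are hypothetical.

```python
def bytes_per_frame(rate_kbps, frame_seconds):
    """Whole bytes that fit in one frame's time slot at the given data rate."""
    return int(rate_kbps * 1000 * frame_seconds) // 8

def truncate_frame(encoded_frame, rate_kbps, frame_seconds):
    """Keep the highest-priority prefix of a frame that fits the link."""
    return encoded_frame[:bytes_per_frame(rate_kbps, frame_seconds)]

FRAME_SECONDS = 1.0     # assumed frame duration
FULL_RATE_KBPS = 64     # assumed rate of the full-quality stream

# Stand-in payload: real frames would hold coded audio, most important bytes first.
full_frame = bytes(i % 256
                   for i in range(bytes_per_frame(FULL_RATE_KBPS, FRAME_SECONDS)))

# The broadcaster sends full_frame once; each link passes on only what fits.
received = {rate: truncate_frame(full_frame, rate, FRAME_SECONDS)
            for rate in (8, 32, 56, 64)}
for rate, data in received.items():
    print(rate, len(data))
```

Because every receiver's data is a prefix of the same frame, the broadcaster does not need to prepare a separate encoding for each connection speed.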

In the experimenters' informal blind listening tests, which polled 25 listeners, the sound quality compared well to CD and MP3 recordings, according to Atlas.

When listeners compared files compressed with the researchers' fine-grained scalable technique and streamed at 32 kilobits per second against MP3 files streamed at 56 kilobits per second, 60 percent said the researchers' files sounded better, 22 percent picked the MP3 files and 17 percent said the two samples sounded the same.

The researchers could not do a comparison with the two types of files both running at 32 kilobits per second because MP3 files cannot stream that slowly, Atlas added.

In comparing a mono version of the researchers' 32-kilobit-per-second sound with mono CD sound, 56 percent of listeners preferred the CD, 31 percent of listeners said the two files sounded the same, and 14 percent said the researchers' technique sounded better, Atlas said.

Today's standard dial-up connection to the Internet provides a top speed of 56 kilobits per second, but the actual speed of data can slow down depending on factors like a busy host computer serving up the music or clogged communications lines between the server and the end-user's computer.

The big question for the technique is how low it can go. According to Atlas, the eventual goal is to transmit true CD quality at 64 kilobits per second and listenable quality at 8 kilobits per second.
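To put those targets in perspective, the arithmetic below works out how much smaller than uncompressed CD audio such streams would be. It uses the standard CD audio format of 44,100 samples per second at 16 bits per sample; treating "CD quality" as two-channel stereo is an assumption, and the helper names are made up for this illustration.

```python
CD_SAMPLE_RATE_HZ = 44_100   # standard CD audio sampling rate
CD_BITS_PER_SAMPLE = 16      # standard CD audio sample size

def cd_rate_kbps(channels):
    """Uncompressed CD audio data rate in kilobits per second."""
    return CD_SAMPLE_RATE_HZ * CD_BITS_PER_SAMPLE * channels / 1000

def compression_ratio(source_kbps, target_kbps):
    """How many times smaller the target stream is than the source."""
    return source_kbps / target_kbps

stereo_cd = cd_rate_kbps(2)   # assumption: "CD quality" means stereo
mono_cd = cd_rate_kbps(1)

ratio_goal_64 = compression_ratio(stereo_cd, 64)   # stated CD-quality goal
ratio_goal_8 = compression_ratio(stereo_cd, 8)     # stated listenable-quality goal
ratio_mono_32 = compression_ratio(mono_cd, 32)     # the mono listening test

print(stereo_cd, ratio_goal_64, ratio_goal_8)
print(mono_cd, ratio_mono_32)
```

Under these assumptions, both the 64-kilobit stereo goal and the 32-kilobit mono test correspond to roughly a 22-to-1 reduction from the uncompressed CD data rate.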

The scheme is an interesting idea supported by good experimental results and worth further investigation, said C.C. Jay Kuo, a professor of electrical engineering and mathematics at the University of Southern California. "The main advantage is the demonstrated coding performance and its scalable transmission property. The disadvantage is a slightly higher complexity compared to today's codec," he said. Codec is short for coder-decoder.

In theory, modulation frequency information could also be used to improve sound fingerprinting methods, said Atlas. Currently, fingerprinting audio means breaking the signal into different audio frequencies, he said. For example, "a song where someone is hitting a cymbal a lot... will have more higher frequencies than another song with a lot of bass drum."

But this is not a very fine-grained method of identifying a song, he said. "The balance of frequency content for a song that sounds different to you and me might not look that different." The researchers are looking into adding modulation frequency information to current methods to provide more fine-grained audio fingerprinting, he said.

The modulation frequency technique could be ready for testing within two years, said Atlas. "I expect that most of the benefit of our technique will be developed by five years from now," he added.

Atlas's research colleague was Mark S. Vinton. They presented the research at the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 7-11, 2001 in Salt Lake City. The research was funded by the Office of Naval Research (ONR).

Timeline:   5 years
Funding:   Government
TRN Categories:   Data Representation and Simulation; Internet
Story Type:   News
Related Elements:  Technical paper, "A Scalable and Progressive Audio Codec," presented at the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 7-11, 2001, Salt Lake City.




August 15, 2001


© Copyright Technology Research News, LLC 2000-2006. All rights reserved.