Mind game smooths streaming audio
By Kimberly Patch, Technology Research News
Listening to streaming audio over the Internet
can be an uneven experience. This is because the Internet's bandwidth,
which determines how much information can flow at any given moment, is
variable. When the lines get crowded, the bandwidth available to a given
user can drop below the level that formats like MP3 require to stream, or transmit
information in real time. This causes the signal to cut out.
Researchers from the University of Washington are exploiting a quirk of
human hearing in an attempt to keep the sound streaming when bandwidth
gets pinched. The technology may also eventually be used to fingerprint
sound files, said Les Atlas, a professor of electrical engineering at
the University of Washington.
The human auditory system prioritizes sounds according to their duration,
putting more weight on the longer, slower aspects of a sound. The researchers'
sound algorithm, termed fine-grained scalable audio encoding, also gives
more weight to these longer, slower portions of a sound, and keeps them
going when the bandwidth gets crowded.
The algorithm also encodes sounds in a new way, allowing for smaller files
in the first place, Atlas said.
Sounds consist of different frequencies, or tones, and our inner ears,
as well as sound encoding techniques like MP3, sort sounds into frequencies,
said Atlas. However, "breaking complex sounds like music up into
frequencies is not the whole story," he said.
Another component of sound is modulation frequency, which is how the frequencies
change as the sound changes from one note to another, said Atlas. "It's
how these frequencies change over time which conveys the message in the
music. It's... how the violin changes from note to note combined with
the detailed changes of each note," he said.
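
To make the idea of a modulation frequency concrete, here is a minimal sketch, not the researchers' algorithm. It assumes the loudness of a single frequency band has already been measured frame by frame, and it applies a plain discrete Fourier transform to that loudness track to find how fast the band is changing. The function name, the frame rate and the 4-hertz test signal are all hypothetical values chosen for illustration.

```python
import cmath
import math

def modulation_spectrum(envelope, frame_rate):
    """Return (modulation_frequency_hz, magnitude) pairs for one band's envelope.

    envelope: loudness of a single frequency band, one value per frame.
    frame_rate: frames per second (a hypothetical parameter for this sketch).
    """
    n = len(envelope)
    mean = sum(envelope) / n
    centered = [v - mean for v in envelope]  # drop the constant (0 Hz) part
    spectrum = []
    for k in range(n // 2 + 1):
        # Plain DFT of the envelope: bin k corresponds to k * frame_rate / n hertz.
        acc = sum(centered[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append((k * frame_rate / n, abs(acc) / n))
    return spectrum

# A band whose loudness wobbles 4 times per second, tracked at 100 frames per second for 1 second.
frame_rate = 100
envelope = [1.0 + 0.5 * math.sin(2 * math.pi * 4 * t / frame_rate) for t in range(100)]
spectrum = modulation_spectrum(envelope, frame_rate)
peak_hz, peak_mag = max(spectrum, key=lambda pair: pair[1])
print(peak_hz)  # the strongest modulation frequency in this band
```

In this toy case the strongest component sits at the 4-hertz wobble rate, which is the kind of slow change the article describes as mattering most to the ear.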
The researchers focused on representing these modulation frequencies efficiently,
which allowed them to compress a given sound into a smaller file or, conversely,
increase the sound quality of a given file size.
At the same time, they put a higher transmission priority on the slow
modulations that are more important to human hearing than the fast ones.
This scalable representation is what allows the music file to continue
streaming, albeit at a lower quality, when bandwidth gets squeezed.
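
The priority ordering described above can be sketched in a few lines. This is an illustration under stated assumptions, not the researchers' codec: it pretends the encoded sound is split into layers, each tagged with a modulation frequency and a bit cost, and it keeps the slowest layers that fit in the available bandwidth. The layer values and the function name `select_layers` are made up for the example.

```python
def select_layers(layers, budget_kbps):
    """Pick the longest run of priority-ordered layers that fits the budget.

    layers: list of (modulation_hz, cost_kbps) tuples; hypothetical values.
    budget_kbps: bandwidth currently available to the listener.
    """
    ordered = sorted(layers, key=lambda layer: layer[0])  # slow modulations first
    sent, used = [], 0.0
    for modulation_hz, cost_kbps in ordered:
        if used + cost_kbps > budget_kbps:
            break  # every remaining layer is lower priority, so stop here
        sent.append(modulation_hz)
        used += cost_kbps
    return sent, used

# Hypothetical layers: (modulation frequency in hertz, cost in kilobits per second).
layers = [(16.0, 12.0), (2.0, 8.0), (64.0, 20.0), (4.0, 8.0), (8.0, 8.0)]
for budget in (56, 32, 8):
    print(budget, select_layers(layers, budget))
```

Because the layers are always taken in the same order, a single encoded stream can serve every budget: a tighter connection simply receives a shorter prefix of it.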
"This priority order is somewhat analogous to progressive transmission
of images, where a blurry image is sent first and more detail follows
later. However, sound is different. Once you hear the sound it's gone.
So by progressive transmission of sound we... mean that the sound quality
is so-so for the first second or so and then gets better. It'll always
be doing the best it can with whatever bandwidth is available," said Atlas.
The scheme is not radically different from other data compression schemes,
which all aim to reduce the amount of space, or bits, a given amount of
information takes up in order to send it more quickly and store it more
efficiently, said Atlas. "It's not like we designed a totally new audio
compression approach. It's more like we discovered a key new internal
function which... affects other more conventional functions, usually in
some bit-reducing ways," he said.
The sound compression technique scales on the fly, making it practical
for many types of users who may have different bandwidths at their disposal,
said Atlas. "Only one stream for all possible data rates needs to be sent
by the broadcaster," he said.
In the experimenters' informal blind tests, which polled 25 listeners, the
sound quality compared well to CD and MP3 recordings, according to Atlas.
In comparisons of sound files compressed with the researchers' fine-grained
scalable technique streaming at 32 kilobits per second with MP3 files
streaming at 56 kilobits per second, 60 percent said the researchers'
files sounded better, 22 percent picked the MP3 files and 17 percent said
the two samples sounded the same.
The researchers could not do a comparison with the two types of files
both running at 32 kilobits per second because MP3 files cannot stream
that slowly, Atlas added.
In comparing a mono version of the researchers' 32-kilobit-per-second
sound with mono CD sound, 56 percent of listeners preferred the CD, 31
percent of listeners said the two files sounded the same, and 14 percent
said the researchers' technique sounded better, Atlas said.
Today's standard dial-up connection to the Internet provides a top speed
of 56 kilobits per second, but the actual speed of data can slow down
depending on factors like a busy host computer serving up the music or
clogged communications lines between the server and the end-user's computer.
The big question for the technique is how low it can go. According to
Atlas, the eventual goal is to transmit true CD quality at 64 kilobits
per second and listenable quality at 8 kilobits per second.
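
For a sense of scale, the arithmetic below compares these targets with uncompressed CD audio, which uses 44,100 samples per second at 16 bits per sample. The article does not say whether the 64-kilobit goal refers to stereo or mono, so the stereo figure here is an assumption; the helper names are hypothetical.

```python
CD_SAMPLE_RATE_HZ = 44_100
CD_BITS_PER_SAMPLE = 16

def cd_bitrate_kbps(channels):
    """Uncompressed CD audio bit rate in kilobits per second."""
    return CD_SAMPLE_RATE_HZ * CD_BITS_PER_SAMPLE * channels / 1000

def compression_ratio(channels, target_kbps):
    """How many times smaller the target stream is than raw CD audio."""
    return cd_bitrate_kbps(channels) / target_kbps

stereo_ratio_64 = compression_ratio(2, 64)  # assumes the 64 kbps goal is stereo
mono_ratio_32 = compression_ratio(1, 32)    # the mono comparison in the listening test
print(cd_bitrate_kbps(2), stereo_ratio_64, mono_ratio_32)
```

Under these assumptions both cases work out to a reduction of roughly 22 to 1 relative to the raw CD data rate.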
The scheme is an interesting idea supported by good experimental results
and worth further investigation, said C.C. Jay Kuo, a professor of electrical
engineering and mathematics at the University of Southern California.
"The main advantage is the demonstrated coding performance and its scalable
transmission property. The disadvantage is a slightly higher complexity
compared to today's codec," he said. Codec is short for coder-decoder.
In theory, modulation frequency information could also be used to improve
sound fingerprinting methods, said Atlas. Currently, fingerprinting audio
means breaking the signal into different audio frequencies, he said. For
example, "a song where someone is hitting a cymbal a lot... will have
more higher frequencies than another song with a lot of bass drum."
But this is not a very fine-grained method of identifying a song, he said.
"The balance of frequency content for a song that sounds different to
you and me might not look that different." The researchers are looking
into adding modulation frequency information to current methods to provide
more fine-grained audio fingerprinting, he said.
The modulation frequency technique could be ready for testing within two
years, said Atlas. "I expect that most of the benefit of our technique
will be developed by five years from now," he added.
Atlas's research colleague was Mark S. Vinton. They presented the research
at the IEEE International Conference on Acoustics, Speech and Signal Processing
(ICASSP), May 7-11, 2001 in Salt Lake City. The research was funded by
the Office of Naval Research (ONR).
Timeline: 5 years
Funding: Government
TRN Categories: Data Representation and Simulation; Internet
Story Type: News
Related Elements: Technical paper, "A Scalable and Progressive
Audio Codec," presented at the IEEE International Conference on Acoustics,
Speech and Signal Processing (ICASSP), May 7-11, 2001, Salt Lake City.
August 15, 2001