Society goes on data binge

By Kimberly Patch, Technology Research News

A couple of Berkeley professors were counting as the world produced a year's worth of data last year, and they have concluded that we produced a staggering amount.

They looked at paper, film, and optical and magnetic media and counted every kind of information they could find numbers on, including email, videos, DVDs, CDs, broadcast information, photographs, books and newspapers.

They concluded that as much as 1.5 million terabytes of unique information was created worldwide in 1999. That 1.5 million terabytes, or 1.5 million million megabytes (MB), works out to 250 MB for every man, woman and child on earth. And keep in mind that the total doesn't include copies, just originals.
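
The per-person figure is a simple division. Here is a minimal sketch, assuming decimal units (1 terabyte = 1 million megabytes, as the "million million megabytes" conversion implies) and a world population of roughly 6 billion in 1999. The population figure is an assumption and is not stated in the article.

```python
# Per-capita share of the 1999 estimate of unique information.
TOTAL_TB = 1.5e6          # 1.5 million terabytes (from the article)
MB_PER_TB = 1e6           # decimal units, assumed
WORLD_POPULATION = 6e9    # assumed: roughly 6 billion people in 1999

total_mb = TOTAL_TB * MB_PER_TB               # 1.5 million million MB
mb_per_person = total_mb / WORLD_POPULATION   # megabytes per person

print(f"{mb_per_person:.0f} MB per person")
```

Under those assumptions the result matches the 250 MB figure quoted above.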

"I was surprised how much information there is," said Peter Lyman, a professor of information management at Berkeley. "It's just a staggering amount of information," he said. If you put 1.5 million terabytes of information on regular 1.4 megabyte floppy disks, the pile would stretch 250 million miles, Lyman said. That's enough to stretch from the earth to the sun more than two and a half times.

The researchers found that the world produced only slightly more information on paper and film in 1999 than the year before but doubled the amount of unique information stored on digital media.

The researchers found a couple of surprises in addition to the sheer volume of the information, said Lyman. "We were astounded at how much information is created by individuals and distributed on the Internet by individuals. We've gone to a world in which everybody has the capacity to create and distribute information and we are doing so," he said.

Another interesting trend was just how tiny a percentage of all information is stored in print, said Lyman. "It's far less than one percent."

Between 23 and 240 terabytes of originals were produced on paper in 1999, according to the research. Book, periodical, and office paper production were up about two percent from the year before while newspaper production shrank by about two percent.
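
To see how small the print share is, the paper range can be set against the 1.5 million terabyte total. This is only a rough check: it assumes the headline total is the right denominator for both ends of the paper range, which the article does not state.

```python
# Paper's share of the 1999 total, using the article's figures.
TOTAL_TB = 1.5e6                         # headline estimate, terabytes
PAPER_LOW_TB, PAPER_HIGH_TB = 23, 240    # range of originals on paper

def share_percent(part_tb, total_tb=TOTAL_TB):
    """Return part_tb as a percentage of total_tb."""
    return 100.0 * part_tb / total_tb

low = share_percent(PAPER_LOW_TB)
high = share_percent(PAPER_HIGH_TB)
print(f"paper share: {low:.4f}% to {high:.4f}%")
```

Even the upper end of the range works out to a few hundredths of a percent, consistent with Lyman's "far less than one percent."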

At the same time, photograph, x-ray and cinema film accounted for between 58,216 and 427,216 terabytes of information, which was about 4 percent more than the year before, according to the research. About 300,000 terabytes of movies were recorded on camcorder tapes in 1999, up five percent. Unique information stored on CDs and DVDs was between 31 and 83 terabytes, or about 70 percent more than the year before. And original information stored on disk drives used between 335,660 and 1,393,000 terabytes, which was about twice as much as the year before.

One reason disk drive information is growing so rapidly is the growth of the Internet, said Lyman. The World Wide Web is made up of about 2.5 billion easily accessible documents and is growing at the rate of 7.3 million pages per day. The Web also harbors 548 billion documents in connected databases, intranets and dynamic pages, 95 percent of which are publicly available.
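
The growth rate can be put in perspective with a back-of-the-envelope calculation. This sketch assumes the rate of 7.3 million pages per day stays constant, which is a simplification and not a claim from the report.

```python
# How quickly 7.3 million new pages per day adds up against 2.5 billion documents.
SURFACE_WEB_DOCS = 2.5e9     # easily accessible documents (from the article)
NEW_PAGES_PER_DAY = 7.3e6    # daily growth (from the article)

# Assumption: the daily rate is constant (linear growth).
days_to_double = SURFACE_WEB_DOCS / NEW_PAGES_PER_DAY
pages_added_per_year = NEW_PAGES_PER_DAY * 365

print(f"days to add another 2.5 billion pages: {days_to_double:.0f}")
print(f"pages added in a year: {pages_added_per_year / 1e9:.2f} billion")
```

At a constant rate, the easily accessible Web would roughly double in a little under a year.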

But email beats the Web hands down in raw information generation, accounting for about 500 times as much information as Web page production, according to the researchers. About 650 billion email messages were sent in 1999, with the average worker receiving about 40 emails at work each day.

Also surprising was the sheer number of photographs taken, said Lyman, who added that relatively few are copied, a proportion that could change as the trend toward digitized photographs grows.

The research showed that the United States produces much more data per person than the rest of the world -- a total of 35 percent of the world's print material, 40 percent of its images, and more than half of its digitally stored data. In addition, 78 percent of all Web sites and 96 percent of e-commerce sites are in English, while only 50 percent of all Internet users are native English speakers.

The most difficult part of the report, which the authors began working on in June 2000, was figuring out how much of the information stored in digital form was original, as opposed to copies of information, said Lyman.

In addition, the authors had to decide how much to adjust the numbers based on compression. "The size of digital [information] depends upon compression. So you have to adjust for the kind of use that you ultimately intend to put information to, because you compress it to a standard consistent with use," said Lyman. The 1.5 million terabytes is somewhat compressed, because that's the way the information is stored. The uncompressed estimate is two million terabytes, and the fully compressed estimate is 800,000 terabytes, he said.
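
The three estimates can be compared directly. This sketch only restates the figures above as ratios. It does not model how the authors actually adjusted for compression, which the article does not describe.

```python
# Compare the three totals quoted above, in terabytes.
UNCOMPRESSED_TB = 2.0e6        # uncompressed estimate
AS_STORED_TB = 1.5e6           # headline figure, "somewhat compressed"
FULLY_COMPRESSED_TB = 0.8e6    # fully compressed estimate

def fraction_of_uncompressed(size_tb):
    """Size relative to the uncompressed estimate (1.0 means no savings)."""
    return size_tb / UNCOMPRESSED_TB

as_stored = fraction_of_uncompressed(AS_STORED_TB)
fully = fraction_of_uncompressed(FULLY_COMPRESSED_TB)
overall_ratio = UNCOMPRESSED_TB / FULLY_COMPRESSED_TB

print(f"as stored: {as_stored:.0%} of uncompressed")
print(f"fully compressed: {fully:.0%} of uncompressed")
print(f"overall compression ratio: {overall_ratio:.1f} to 1")
```

On these numbers the headline total is three-quarters of the uncompressed estimate, and full compression would shrink the uncompressed estimate by a factor of 2.5.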

The authors found they could not compare their numbers to similar reports made in the '80s because "we didn't have a common measure," said Lyman, who pointed out that information now means bits, not words.

It was also difficult getting numbers for Third World countries, Lyman said.

Lyman and his research colleague Hal Varian have published their report, titled "How Much Information," on the Internet and are hoping to get help refining the numbers. "A lot of what we did was put methodologies out there. We're asking people to get back to us so we can revise the study. [We're hoping] people have ideas for better data sources and better ways to estimate. So the next stage is to get feedback so we can improve the model," said Lyman.

The researchers plan to publish a future report with the refined numbers, then use the numbers for further study of the information that society produces. The study was funded by EMC Corp.

Timeline:   Now
Funding:   Corporate
TRN Categories:  Computers and Society
Story Type:   News
Related Elements:   Technical report "How Much Information," available at


October 25/November 1, 2000

© Copyright Technology Research News, LLC 2000-2006. All rights reserved.