Network tools handle hefty science files

By Eric Smalley, Technology Research News

Modeling the weather, cataloging stars and smashing subatomic particles into each other are scientific experiments that require fast computers and generate vast amounts of data.

Sharing those resources and data is far beyond the capabilities of the everyday Internet, as anyone who has waited for a single photograph to download can attest. But the portion of the Internet that connects research institutions contains many high-speed links.

Researchers based at the Argonne National Laboratory are forging a high-speed network using these links. The network, dubbed the Data Grid, allows physicists in Switzerland, for example, to share raw data from their experiments with scientists in Japan and the United States. The researchers have bolstered the Data Grid with a souped-up version of the File Transfer Protocol (FTP) and software for keeping track of copies of files.

"We want to support applications that require community access to, and analysis of, large amounts of data," said Ian Foster, a senior scientist and associate director in the Mathematics and Computer Science Division at Argonne National Laboratory and a computer science professor at the University of Chicago.

Scientists frequently copy large sets of data to computers around the Internet in order to share the information as well as scarce high-performance computing resources.

For a sense of scale, the researchers note that the Large Hadron Collider at the European Organization for Nuclear Research (CERN) is expected to produce several petabytes of data each year for about 15 years starting in 2005. A petabyte is one million gigabytes, or the equivalent of about a billion 500-page textbooks.
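The arithmetic behind that comparison can be checked in a few lines. This is a rough sketch that assumes decimal units (one petabyte taken as 10^15 bytes); the per-textbook and per-page sizes are simply what the article's figures imply, and the sustained-rate figure is an added illustration of what moving one petabyte in a year would require, not a number from the researchers.

```python
# Assumption: decimal (SI) units, so 1 petabyte = 10**15 bytes.
PETABYTE = 10**15
GIGABYTE = 10**9
TEXTBOOKS_PER_PETABYTE = 10**9       # "about a billion" textbooks, per the article
PAGES_PER_TEXTBOOK = 500
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

# One petabyte expressed in gigabytes.
gigabytes_per_petabyte = PETABYTE // GIGABYTE

# Size of one textbook and one page implied by the comparison.
bytes_per_textbook = PETABYTE // TEXTBOOKS_PER_PETABYTE
bytes_per_page = bytes_per_textbook // PAGES_PER_TEXTBOOK

# Average rate, in megabits per second, needed to move one petabyte in a year.
sustained_mbps = PETABYTE * 8 / SECONDS_PER_YEAR / 1e6

print(gigabytes_per_petabyte, bytes_per_textbook, bytes_per_page)
print(round(sustained_mbps, 1))
```

In other words, the comparison treats a textbook as roughly a megabyte of text, and each petabyte per year corresponds to a link kept busy at a few hundred megabits per second around the clock.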

"The Grid builds on the services provided by the Internet, providing additional services that enable the controlled sharing and coordinated use of distributed resources," said Foster. The Grid currently connects thousands of computers and supports hundreds of researchers. "Large projects starting now will scale this up by one or two orders of magnitude," he said.

The GridFTP and Grid replica management tools will enable "higher-level services that support automated and scheduled data replication, automatic replica selection, and intelligent scheduling of data analyses based on information about data location and system status," said Foster.

GridFTP improves on FTP, the venerable file transfer tool of the Internet, by adding support for parallel data transfer, striped data transfer and transferring arbitrary subsets of files. Parallel data transfer speeds downloads and uploads by allowing multiple FTP streams over a single wide-area data link. Striped data transfer allows scientists to transfer files that have been striped, or saved across multiple servers.
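The idea behind parallel and partial transfers can be sketched in a few lines of Python. This is an illustration only, not the GridFTP protocol or its API: the function names are hypothetical, the "remote file" is a bytes object held in memory, and each thread stands in for one data stream fetching an arbitrary subset of the file.

```python
from concurrent.futures import ThreadPoolExecutor


def byte_ranges(size, streams):
    """Split [0, size) into up to `streams` contiguous (offset, length) ranges."""
    if size < 0 or streams < 1:
        raise ValueError("size must be >= 0 and streams must be >= 1")
    base, extra = divmod(size, streams)
    ranges, offset = [], 0
    for i in range(streams):
        # The first `extra` ranges take one additional byte each.
        length = base + (1 if i < extra else 0)
        if length:
            ranges.append((offset, length))
        offset += length
    return ranges


def fetch_range(source, offset, length):
    """Hypothetical stand-in for one stream: return one subset of the file."""
    return source[offset:offset + length]


def parallel_fetch(source, streams=4):
    """Fetch every range concurrently and reassemble the pieces in order."""
    ranges = byte_ranges(len(source), streams)
    with ThreadPoolExecutor(max_workers=streams) as pool:
        # map() yields results in the order of `ranges`, so a plain join
        # restores the original byte order.
        parts = pool.map(lambda r: fetch_range(source, *r), ranges)
        return b"".join(parts)


data = bytes(range(256)) * 100
copy = parallel_fetch(data, streams=4)
```

A striped transfer would follow the same pattern, except that each range would be read from a different server holding that portion of the file.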

The Grid replica management services centralize data organization by allowing scientists to post copies of data at various locations on the network and register them in a catalog. Scientists can query the catalog and select the copy they can download fastest based on network and storage system performance.
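A catalog of this kind can be pictured as a mapping from a logical file name to the locations that hold a copy. The sketch below is a hypothetical illustration, not the Grid replica management interface: the class and function names, the example URLs, and the simple cost model (latency plus size divided by measured bandwidth) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    url: str                # where this copy lives (hypothetical URL)
    bandwidth_mbps: float   # assumed measured throughput, megabits per second
    latency_s: float        # assumed storage and network startup delay, seconds


class ReplicaCatalog:
    """Maps a logical file name to every registered copy of that file."""

    def __init__(self):
        self._entries = {}

    def register(self, logical_name, replica):
        self._entries.setdefault(logical_name, []).append(replica)

    def lookup(self, logical_name):
        return list(self._entries.get(logical_name, []))


def estimated_seconds(replica, size_bytes):
    """Assumed cost model: startup delay plus transfer time at measured speed."""
    return replica.latency_s + (size_bytes * 8) / (replica.bandwidth_mbps * 1e6)


def select_fastest(catalog, logical_name, size_bytes):
    """Return the replica with the lowest estimated download time."""
    replicas = catalog.lookup(logical_name)
    if not replicas:
        raise LookupError(f"no replicas registered for {logical_name!r}")
    return min(replicas, key=lambda r: estimated_seconds(r, size_bytes))


catalog = ReplicaCatalog()
catalog.register("run42.dat", Replica("ftp://site-a.example.org/run42.dat", 100.0, 0.05))
catalog.register("run42.dat", Replica("ftp://site-b.example.org/run42.dat", 622.0, 0.20))
best = select_fastest(catalog, "run42.dat", size_bytes=10**9)
```

With these made-up numbers the second site wins for a one-gigabyte file, because its higher bandwidth outweighs its longer startup delay. For a very small file the lower-latency copy would be chosen instead.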

"These tools in themselves are only part of the overall puzzle, but the entire Data Grid infrastructure is what is going to enable the creation of the cyber infrastructure that will support 21st century science," Foster said.

Foster's research colleagues were Bill Allcock, Joe Bester, John Bresnahan, Sam Meder, Veronika Nefedova, Darcy Quesnel and Steven Tuecke of Argonne National Laboratory, and Ann L. Chervenak and Carl Kesselman of the University of Southern California. The researchers' work is scheduled to appear in the proceedings of the 18th IEEE Symposium on Mass Storage Systems. The research was funded by the Department of Energy and the National Science Foundation.

Timeline:   Now
Funding:   Government
TRN Categories:   Networking; Data Storage Technology; Internet
Story Type:   News
Related Elements:  Technical paper, "Secure, Efficient Data Transport and Replica Management for High-Performance Data-Intensive Computing," scheduled to appear in the proceedings of the 18th IEEE Symposium on Mass Storage Systems


April 25, 2001

© Copyright Technology Research News, LLC 2000-2006. All rights reserved.