Net scan finds like-minded users
Technology Research News
When you search for information on the
Web, chances are you aren't alone -- there are like-minded groups of users
across the Web searching for the same sorts of things.
Researchers from the University of Chicago have shown that it is
possible to identify these groups by analyzing browsing patterns, even
in networks as far-flung as the Web.
The researchers' method of graphing information across data distribution
systems like the Internet shows that, given a large enough sample, computer
users can be grouped according to their common interests based only on
their requests for data. "One of the first questions we asked was is the
group-based collaboration of scientists mirrored somehow in their usage
of data," said Adriana Iamnitchi, a researcher at the University of Chicago.
The answer turned out to be yes, across all types of group-based
interests. "Communities as heterogeneous as the Web seem to show this
pattern of having users naturally group in interest-based groups," she said.
The information-request graphing method can be used to design
scalable, adaptive methods for locating and delivering data, said Iamnitchi.
The method could theoretically be used by anyone, including ecommerce
vendors, to target communities of interest.
The researchers are working on using the patterns to design more
efficient services for resource-sharing environments like Grid computing, Iamnitchi said.
Grid software coordinates a few or even hundreds of computers
across networks like the Internet to piece together compute power and
resources like databases into powerful virtual computers; the combined
resources can speed up scientific and engineering applications like time-consuming
equations and three-dimensional simulations.
The researchers found the data-sharing relationship pattern while
looking for a way to leverage characteristics of the Grid computing community
to make that type of computing more efficient, according to Iamnitchi.
"Our idea was to... design mechanisms [that are] able to cope efficiently
with large and dynamic numbers of resources -- data files, computers,
and storage space for results," she said.
One typical characteristic of the community that uses Grid computing
is that its members tend to collaborate, said Iamnitchi. When the researchers analyzed
traces of scientific computations from a high-energy physics collaboration
that spanned 18 countries and involved 70-odd institutions and thousands
of physicists, they found that the patterns of collaboration were mirrored
in scientists' data requests.
The researchers looked at the relationships that formed among
users based on the data they were interested in. "We captured and quantified
these relationships by modeling the system as a data-sharing ... graph
whose nodes are the data consumers in that system," said Iamnitchi. Nodes,
or people, who requested a given number of the same files within a given
time were connected.
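
The connection rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' implementation: the function name `build_data_sharing_graph`, the parameters `min_shared` and `window`, and the choice of fixed, consecutive time windows are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def build_data_sharing_graph(requests, min_shared=1, window=3600.0):
    """Build an undirected graph of data consumers.

    requests: iterable of (user, file_id, timestamp_in_seconds) tuples.
    Two users are linked if, within the same time window, they requested
    at least `min_shared` of the same files. (Hypothetical parameters.)
    """
    # window index -> user -> set of files requested in that window
    files_by_window = defaultdict(lambda: defaultdict(set))
    for user, file_id, timestamp in requests:
        files_by_window[int(timestamp // window)][user].add(file_id)

    graph = defaultdict(set)
    for users in files_by_window.values():
        for user in users:
            graph[user]  # make sure every user appears as a node
        for a, b in combinations(sorted(users), 2):
            if len(users[a] & users[b]) >= min_shared:
                graph[a].add(b)
                graph[b].add(a)
    return dict(graph)


# Made-up request trace: alice and bob both fetch f1 in the first hour.
requests = [
    ("alice", "f1", 10), ("bob", "f1", 20), ("bob", "f2", 30),
    ("carol", "f3", 40), ("alice", "f2", 5000),
]
graph = build_data_sharing_graph(requests, min_shared=1, window=3600)
```

In this toy trace only alice and bob end up connected; carol shares no files with anyone, and alice's later request for f2 falls into a different window than bob's.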
In an analysis of six months' worth of scientists' requests for
data, the researchers found that group-based collaboration is visible
in the way information is requested, said Iamnitchi. "Scientists form
groups of interest based on the data they used," she said. The researchers
found the same pattern in a larger analysis of general Web requests.
The pattern of similar requests shared the small-world characteristic
common in many networks, including the way data is arranged in networks
like the Internet.
In small-world networks, it is possible to get from one node to
any other node by traversing relatively few links. Social networks, with
people as nodes and relationships as links, and the Web, with pages as
nodes and the hyperlinks between them as links, are also small-world networks.
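
As a rough illustration of what "relatively few links" means, the sketch below computes two measures commonly used to characterize small-world graphs: the average shortest-path length and the clustering coefficient. The article names neither measure, so both function names and the choice of measures are assumptions; the graph is a dictionary mapping each node to the set of its neighbors.

```python
from collections import deque


def average_path_length(graph):
    """Mean shortest-path length over all connected ordered node pairs."""
    total = pairs = 0
    for source in graph:
        # Breadth-first search gives hop counts from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def clustering_coefficient(graph):
    """Average, over nodes, of the fraction of neighbor pairs that are linked."""
    scores = []
    for neighbors in graph.values():
        nbrs = list(neighbors)
        k = len(nbrs)
        if k < 2:
            scores.append(0.0)  # fewer than two neighbors: no pairs to link
            continue
        links = sum(
            1
            for i in range(k)
            for j in range(i + 1, k)
            if nbrs[j] in graph[nbrs[i]]
        )
        scores.append(2 * links / (k * (k - 1)))
    return sum(scores) / len(scores) if scores else 0.0


# Toy graph: a triangle (a, b, c) with one extra node d attached to c.
toy = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
print(average_path_length(toy), clustering_coefficient(toy))
```

A small-world graph typically combines a short average path length with a clustering coefficient well above that of a random graph of the same size.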
Looking at small-world topologies is not a novel idea, but the
method of extracting a graph from an arbitrary data-sharing relationship
and using it to study these structures is, said Filippo Menczer, an assistant
professor of management sciences at the University of Iowa.
Data request patterns have been analyzed previously, but in different
ways -- to examine the popularity distribution of Web requests or to study
the most efficient way to cache Internet traffic. In contrast, the Chicago
researchers' analysis uncovered relationships between users based on their
common interests in data.
The method is potentially useful, especially because a graph can
be made from any Web usage log, said Menczer. "Any Webmaster can do this."
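
To make Menczer's point concrete, here is a hedged sketch of how a Webmaster might turn a usage log into (user, file, time) records, the raw material for such a graph. It assumes the server writes the widely used Common Log Format and treats the client host as a stand-in for a user; neither assumption comes from the article, and `parse_requests` is a hypothetical helper name.

```python
import re
from datetime import datetime

# Common Log Format: host ident authuser [date] "method path protocol" status bytes
LOG_LINE = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "\S+ (\S+)[^"]*" (\d{3}) \S+'
)


def parse_requests(lines):
    """Yield (host, path, epoch_seconds) for each successful request."""
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        host, stamp, path, status = match.groups()
        if not status.startswith("2"):
            continue  # keep only requests the server answered successfully
        when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
        yield host, path, when.timestamp()


# Made-up log lines for illustration.
sample_log = [
    '192.0.2.1 - - [07/May/2003:10:00:00 +0000] "GET /papers/a.pdf HTTP/1.0" 200 1234',
    '192.0.2.2 - - [07/May/2003:10:05:00 +0000] "GET /papers/a.pdf HTTP/1.0" 200 1234',
    '192.0.2.3 - - [07/May/2003:10:06:00 +0000] "GET /missing HTTP/1.0" 404 0',
    'not a log line',
]
records = list(parse_requests(sample_log))
```

Hosts are only an approximation of users, since proxies and shared machines can merge or split individuals.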
The method may be useful for discovering clusters of users who
have interest in a certain type of data, Menczer said. "Ecommerce vendors
are currently using collaborative filtering techniques, which are related
to this," to do so, he said. The method can also be used for distributed
caching and broadcasting, similar to the services offered by Akamai Technologies
Inc., he said.
The researchers are now making the method more efficient for resource-sharing
environments like Grid computing, said Iamnitchi. "We are currently looking...
to design mechanisms to locate resources," she said. "The ultimate goal
is to provide scalable, adaptive mechanisms [that are] able to deal with
variations in resource participation."
The resource location mechanisms could be ready to use within
two years, Iamnitchi said.
Iamnitchi's research colleagues were Matei Ripeanu from the University
of Chicago and Ian Foster from Argonne National Laboratory. The research
was funded by the National Science Foundation (NSF).
Timeline: 2 years
TRN Categories: Internet; Distributed Computing
Story Type: News
Related Elements: Technical paper, "Data-Sharing Relationships
in the Web," posted on the arXiv physics archive at arXiv.org/abs/cs.NI/0302016.
May 7/14, 2003