English could snowball on Net
Ted Smalley Bowen,
Technology Research News
The Internet’s ability to connect a wide
range of cultures would seem to bode well for diversity of all sorts.
But, while the technology is relatively neutral, the influences of political
and economic power have made the Internet
a virtual English-language empire.
Researchers from Tel Aviv University and the University of California
at Berkeley have teamed up to gauge the nature of the relationship between
linguistic patterns and Internet content.
Early returns from the work imply that English content will continue to
dominate the Internet, although other studies predict different scenarios.
Currently about 70 percent of Internet content is in English, but only
about 44 percent of Internet users are native English speakers. Worldwide,
native Spanish speakers outnumber native English speakers, and native
Chinese speakers outnumber both groups combined. English
dominates online because it was established early on as the lingua franca
of the wired world.
The imbalance reflects a first-mover advantage that is common in networks
of all kinds, according to Neil Gandal, an associate professor of economics
at Tel Aviv University in Israel.
In this case, the language of Shakespeare, Mark Twain, H.L. Mencken, and
Yogi Berra benefits from the snowballing effect of a popular medium attracting
more users simply because it’s popular. The language's popularity spurs
more people to learn English, which increases incentives for content providers
to cater to an English-speaking audience, which in turn makes it all the
The researchers examined whether these first-mover effects dictate that
English will simply gain momentum and remain the primary online language,
prompting even more people to learn it, or whether the demographic and
economic realities of a polyglot world will turn the tide.
This question is especially pertinent because Internet use among non-native
English speakers is growing at a faster rate than that of native English
speakers. By 2003 only 29 percent of Web
users will be native English speakers, according to one estimate.
The researchers analyzed the surfing habits of a usefully bilingual population
-- Canadians in the province of Québec. As of 1996, roughly 5.7 million
Québec citizens counted French as their mother tongue, about 600,000 cited
English, and about 60,000 listed both.
The researchers looked at users’ overall time online and time spent at
each of seven types of sites: retail, business and finance; entertainment,
news, sports and technology; education; portals, searches and directories;
services, including ISPs, careers, and hobbies; government; and adult.
To get a rough breakdown by language of the content surfed, the researchers
wrote a spider program that identified the languages of the approximately
40,000 Quebecois URL domains
The researchers compared the overall Internet use of the three linguistic
camps by type of sites, regardless of the content language, and then looked
at which factors determined the percent of the time devoted to English
The native English speakers visited English content sites 87 percent of
the time and stayed online about 35 percent longer than their French-speaking
neighbors. The native French speakers, however, surfed in English a still
considerable 64 percent of the time.
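The tally described above, in which each domain gets a language label and viewing time is then split by label, can be sketched roughly as follows. This is a minimal illustration and not the researchers' spider: the stopword lists, the domain names, the page text and the minute counts are all hypothetical, and the stopword-counting heuristic is a stand-in for whatever language identification the team actually used.

```python
import re

# Hypothetical, deliberately tiny stopword lists (an assumption for this sketch).
STOPWORDS = {
    "en": {"the", "and", "of", "to", "in", "is", "for", "with"},
    "fr": {"le", "la", "les", "et", "des", "est", "pour", "avec"},
}


def guess_language(text):
    """Return the language whose stopwords occur most often in text.

    Returns None when no stopword from any list is found. Ties go to the
    language listed first in STOPWORDS.
    """
    words = re.findall(r"\w+", text.lower())
    counts = {
        lang: sum(word in vocab for word in words)
        for lang, vocab in STOPWORDS.items()
    }
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None


def english_share(visits, domain_language):
    """Fraction of labeled viewing time spent on domains labeled English.

    visits is an iterable of (domain, minutes) pairs; domains without a
    language label are left out of both numerator and denominator.
    """
    total = 0.0
    english = 0.0
    for domain, minutes in visits:
        lang = domain_language.get(domain)
        if lang is None:
            continue
        total += minutes
        if lang == "en":
            english += minutes
    return english / total if total else 0.0


# Hypothetical sample pages and visit log, for illustration only.
pages = {
    "example-news.ca": "The results of the study are in the report for the province",
    "exemple-actualites.ca": "Les résultats de l'étude et le rapport pour la province",
}
domain_language = {domain: guess_language(text) for domain, text in pages.items()}
visits = [("example-news.ca", 30), ("exemple-actualites.ca", 10)]
share = english_share(visits, domain_language)
```

A heuristic this crude would mislabel short or mixed-language pages, which may be one reason finer language identification matters for this kind of breakdown.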
The differences also narrowed with age: younger native French speakers
looked at more English content than their elders.
The finding that native French speakers are hurdling the linguistic barrier
and turning to English sites for content not available in French is evidence
that English's first-mover advantage is still snowballing, according to
Gandal. These network effects are likely to continue to favor creating
content in English and to lower incentives to do so in French, he said.
These preliminary results also indicate that the Internet is increasing
the incentive for non-native English speakers to learn English as a second
language, which could in turn promote English as a global language, according
In addition, although automatic translation technologies may eventually
break down linguistic barriers, they are currently too limited to be a
likely influence on the choice of content language, said Gandal. “Translation
is very difficult because of the subtlety involved in the use of language,"
Computer-generated translation does work well for finding simple information
like a train or airline schedule or the location of a particular office,
but does not convey more complicated communications like disease diagnosis
or an explanation of how to make a retail purchase, said Gandal. "We don’t
think that they will play a prominent role in the choice of language content
in the foreseeable future."
The issue of language representation on the Internet is a contentious
one, and is complicated by widespread financial stakes and cultural implications.
The researchers' conclusions contradict those of the Foundation for Networks
and Development, a private regional development organization in the Dominican
Republic.
The current predominance of English on the Internet is largely due to
the network's American origins and to the fact that the first wave of users
worldwide is more likely to speak English as a second language, said Daniel
Pimienta, director of the foundation.
The foundation's statistics show that this is changing, he said. For instance,
three years ago 75 percent of Web pages were in English, but that number
has dropped to 50 percent today. In addition, the number of English Web
pages relative to the number of people who speak English as a native or
second language is falling compared with the equivalent ratios for Spanish,
French, Italian and Portuguese, he said.
As the Internet's population becomes more diverse and an increasing percentage
of its users lack English skills, the early predominance of English will
continue to fade, he said. "As the Internet evolves toward a more balanced
geographical [distribution] and a more balanced socio-economic distribution,
the dominance of English will more and more appear as a transitional phenomenon
and the representation of language in the Net will tend to become closer
to the natural representation of the language in the world."
As this happens, however, English will retain a special role in bridging
communities whose native languages are different, he added. "This is and
will remain the case of English, but also of Spanish, French, Arabic and
Under this scenario, monolingual native English speakers may be more likely
to pick up another tongue, Pimienta said. "The Internet will probably
represent a strong asset for the language training industry to add a second
language to native English speakers."
The Tel Aviv and Berkeley team's choice of a mostly bilingual population
like Québec's makes it harder to gauge the factors driving the choice
of language on the Internet, Pimienta said. That population is able to
navigate in English, while 90 percent of the world's population does not
understand English, he said.
The Tel Aviv and Berkeley researchers are currently working on a model
designed to distinguish between the cultural and economic factors driving the
spread of English and those effects specific to the Internet, Gandal said.
One goal is finding how closely the use of English online will hew to
the demographic and economic realities of English speakers. “The question
is whether the percent of Internet content in English will reflect...
or... greatly exceed the percentage of native English speakers around
the world, weighted by purchasing power,” said Gandal.
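One way to read "weighted by purchasing power" is to count each language group's speakers multiplied by their average purchasing power, then take the group's share of the total. The article does not give the researchers' actual definition, so the function below is only a sketch under that assumption, and the figures in it are made up for illustration.

```python
def weighted_share(groups, target):
    """Share of the target group after weighting speakers by purchasing power.

    groups maps a group name to (speakers, purchasing_power_per_speaker).
    Both the weighting scheme and the units are assumptions of this sketch.
    """
    weights = {
        name: speakers * power
        for name, (speakers, power) in groups.items()
    }
    total = sum(weights.values())
    return weights[target] / total if total else 0.0


# Made-up illustrative figures, not real demographic or economic data.
groups = {
    "english": (1.0, 3.0),  # 1 unit of speakers, 3 units of purchasing power each
    "other": (9.0, 1.0),    # 9 units of speakers, 1 unit of purchasing power each
}
unweighted = groups["english"][0] / sum(s for s, _ in groups.values())
weighted = weighted_share(groups, "english")
```

With these toy numbers the unweighted share of English speakers is 10 percent and the weighted share is 25 percent, which shows how the two benchmarks Gandal contrasts can diverge.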
The researchers plan to delve into data for all of Canada in an effort
to quantify factors like the number of Internet pages read or transactions
conducted that would justify continued use of and investment in a particular
language, Gandal said. “The model will need to distinguish between adults
who find it harder to learn a new language... and children who find it
easier," and therefore get more out of the experience, he said.
The researchers' updated model will also help quantify the strong network
effects favoring development in English and drawing the best bells and
whistles to English sites which, at least initially, place non-English
sites at a disadvantage.
As more precise language identification software emerges, the researchers
will be better able to determine the breakdown of pages visited according
to content language, according to Gandal.
Gandal's research associate was Carl Shapiro of the University of California
at Berkeley. They presented the work last month at the Telecommunications
Policy Research Conference (TPRC) 29th Research Conference on Communication,
Information and Internet Policy in Alexandria, Virginia. The research
was funded by UC Berkeley.
TRN Categories: Internet, linguistics
Story Type: News
Related Elements: Technical paper, “The effect of native
language on Internet usage”, Telecommunications Policy Research Conference
(TPRC) 29th Research Conference on Communication, Information and Internet
Policy, October 27-29, 2001, Alexandria, Virginia.