New Internet Protocol Sets Milestone for Fast and Friendly
Trans-Atlantic Data Transport
October 10, 2003
Chicago, Illinois. A new milestone was reached in trans-Atlantic
data transmission today by researchers at the University of
Illinois at Chicago (UIC) who demonstrated the practicality of
transferring even very large data sets over high-speed production
networks.
UIC's National Center for Data Mining (NCDM) and Laboratory for Advanced
Computing moved a set of astronomical data across the Atlantic
at 6.8 gigabits per second, 6800 times faster than the 1 megabit
per second effective speed that connects most companies to the
internet.
In the test, 1.4 terabytes of astronomical data was transmitted
from Amsterdam to Chicago in 30 minutes using UDT, a new protocol
developed by the NCDM at the University of Illinois at Chicago.
In comparison, moving the same amount of data across the Atlantic
with standard dial-up internet connections would take
approximately 25 days.
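The volume and duration quoted above imply an average transfer rate,
which can be checked with a few lines of Python. This is a rough
sketch, not part of the test itself. The release does not say whether
"terabyte" means 10**12 or 2**40 bytes, or whether 6.8 gigabits per
second is an average or a peak rate, so both byte conventions are
computed as labeled assumptions. The function and variable names are
illustrative only.

```python
def average_rate_gbps(num_bytes: float, seconds: float) -> float:
    """Average throughput in gigabits per second (1 Gb = 10**9 bits)."""
    return num_bytes * 8 / seconds / 1e9


SECONDS = 30 * 60  # the 30-minute transfer quoted in the release

# Assumption: the release does not define "terabyte", so try both conventions.
decimal_bytes = 1.4 * 10**12  # 1 TB = 10**12 bytes
binary_bytes = 1.4 * 2**40    # 1 TB = 2**40 bytes

rate_decimal = average_rate_gbps(decimal_bytes, SECONDS)
rate_binary = average_rate_gbps(binary_bytes, SECONDS)

# Ratio between the quoted 6.8 Gb/s and the quoted 1 Mb/s company link.
speedup = 6.8e9 / 1e6

print(f"decimal TB: {rate_decimal:.2f} Gb/s")
print(f"binary TB:  {rate_binary:.2f} Gb/s")
print(f"speedup over 1 Mb/s: {speedup:.0f}x")
```

Under the binary convention the average comes out close to the quoted
6.8 gigabits per second. Under the decimal convention it is somewhat
lower, at roughly 6.2 gigabits per second.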
Moving large data sets over the internet faces several hurdles:
First, the network infrastructure for long-distance 1 gigabit per
second and 10 gigabit per second network links is still maturing,
and software that can use this infrastructure is just being
developed. The UIC computer clusters used for the test were
connected to the SURFnet network in Amsterdam and the Abilene
network in Chicago. The test also demonstrated the quality and
power of these two networks, which are among the world's leading
research networks.
Second, today's predominant network protocol, TCP, as it is
normally deployed, is not effective at moving massive data over
long distances. UDP, another network protocol that is also
widely deployed, cannot reliably transport data (some data may be
lost) and is not friendly to other flows (using it for large data
transfers can starve other network traffic). Currently, efforts
are underway to improve TCP, to develop new protocols to replace
TCP, and to develop protocols on top of TCP and UDP that are
effective for high performance data transport.
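To illustrate what building a reliable protocol on top of an
unreliable one involves, the following Python sketch simulates a lossy
channel and recovers from loss with sequence numbers and
retransmission of missing packets. It is a toy model under stated
assumptions and is not UDT's actual algorithm, which this release does
not describe. The channel is simulated in memory rather than with real
UDP sockets, the receiver's report of missing packets is assumed never
to be lost, and there is no congestion control, so the "friendliness"
discussed above is not modeled. All names are hypothetical.

```python
import random


def send_unreliable(packets, loss_rate, rng):
    """Model a UDP-like channel: each packet is dropped independently
    with probability loss_rate. Nothing is retransmitted here."""
    return [p for p in packets if rng.random() >= loss_rate]


def transfer_reliably(chunks, loss_rate=0.2, seed=1):
    """Deliver every chunk over the lossy channel.

    Each chunk is tagged with a sequence number. After every round the
    sender resends only the sequence numbers still missing.
    Assumption: the list of missing numbers reaches the sender intact.
    Returns the reassembled chunks and the number of rounds used.
    """
    rng = random.Random(seed)
    received = {}
    missing = set(range(len(chunks)))
    rounds = 0
    while missing:
        rounds += 1
        outgoing = [(seq, chunks[seq]) for seq in sorted(missing)]
        for seq, payload in send_unreliable(outgoing, loss_rate, rng):
            received[seq] = payload
        missing.difference_update(received)
    return [received[i] for i in range(len(chunks))], rounds


if __name__ == "__main__":
    data = [bytes([i % 256]) * 4 for i in range(100)]
    result, rounds = transfer_reliably(data, loss_rate=0.3, seed=7)
    print("intact:", result == data, "rounds:", rounds)
```

The sequence numbers let the receiver put data back in order and
detect gaps. Without them, a single pass over the same channel simply
loses part of the data.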
In the past, to overcome these problems, high-speed data transfers
of very large data sets used special-purpose research networks
and employed specialized data protocols that, in practice, did not
allow other network traffic to share the same link.
Friday's test run used a new network protocol called UDP-based
Data Transport, or UDT, which was developed by the National Center
for Data Mining at the University of Illinois at Chicago. Unlike
some other protocols now being studied for high-speed data
transfer, UDP-based protocols can be used over today's Internet
without making changes to the network infrastructure. Today's
demonstration showed not only that UDT was fast, but also that it
was friendly and could effectively coexist with thousands of
other network connections.
The demonstration is part of an ongoing international effort to
find and test new ways of reliably moving massive data sets
around the globe using advanced networks and new data transfer
protocols. Such systems hold enormous promise for advancing
scientific research, in addition to numerous commercial
applications. Today, although it is becoming common for global
businesses to have important data in different cities, it is still
quite difficult to integrate this data to create a common view.
"Using UDT, it is now practical for the first time to move even
massive data sets over very long distances in a friendly fashion
using today's networks," said Robert Grossman, Director of UIC's
National Center for Data Mining and President of Open Data
Partners.
UDT is currently being used by several international research
projects. UDT is used by the OptIPuter, a research project
developing next generation computing infrastructures based upon
advanced photonics. UDT also plays a role in research projects
developing high performance web services, something that is
required in order to scale today's web services to large remote
and distributed data sets.
UDT is used as the network transport layer in the joint
University of Illinois/Northwestern project on Photonic Data
Services (PDS), which is developing open source data services for
next generation photonic networks, such as the OptIPuter. The
OptIPuter is an example of what are sometimes called lambda grids,
distributed computing infrastructures in which applications can set
up their own photonic paths (lambdas) supporting data transport
at gigabit-per-second speeds and higher.
"Moving data at 6.8 Gigabits per second across the Atlantic using
UDT is an important milestone for the OptIPuter Project and
brings us a bit closer to effective data management over lambda
grids," said Larry Smarr, Principal Investigator of the OptIPuter Project
and Director of the California Institute for Telecommunications
and Information Technology, a UC San Diego/UC Irvine partnership.
UDT is also being used as one of the layers of a UIC project
called Open DMIX (for Data Mining, Data Integration, and Data
Exploration), which is developing open source high performance web
services for data mining.
"Using UDT and the scalable data mining and data integration web
services built on top of it may emerge as an important enabling
technology for the grid computing required for next generation
virtual observatories," according to Alex Szalay, Alumni
Centennial Professor in the Department of Physics and Astronomy at
The Johns Hopkins University.
The tests were made possible by support from the following
manufacturers and organizations, who have generously contributed
their equipment, facilities, and know-how: OMNInet, StarLight,
Nortel, SARA and CANARIE. Partial funding for the tests was
provided by the National Science Foundation (Grants 0129609,
9977868 and 0225642) and the University of Illinois at Chicago.
For more information, contact:
Shirley Connelly, Associate Director, NCDM
312 413 2176, connelly at uic dot edu.
Robert Grossman, Director, NCDM
312 413 2176, grossman at uic dot edu.
National Center for Data Mining
The National Center for Data Mining (NCDM) at the University of
Illinois at Chicago (UIC) was established in 1998 to serve as a
national resource for high performance and distributed data
mining. The Center sponsors research projects, facilitates
standards, operates testbeds, and provides outreach. The Center
is coordinating the development of the Predictive Model Markup
Language (PMML), the standard for statistical and data mining
models, as well as the WS-DMX web services for data mining and
data exploration standard. The NCDM also operates the Terra Wide
Data Mining Testbed, a worldwide testbed for high performance and
distributed data mining. For more information about NCDM, see
www.ncdm.uic.edu.
SURFnet
SURFnet operates and innovates the national research network in
the Netherlands, to which 150 institutions in higher education
and research are connected. To remain in the lead, SURFnet puts
in a sustained effort to improve the infrastructure and to
develop new applications that give users faster and better access
to new Internet services. Currently, SURFnet's network innovation
is funded by the Dutch government via the GigaPort project. For
more information, please visit
www.surfnet.nl.
About the OptIPuter
The OptIPuter, started in October 2002, is a five-year, $13.5
million project funded by the National Science Foundation. It
will enable scientists who are generating massive amounts of data
to interactively visualize, analyze and correlate their data from
multiple storage sites connected to optical networks. The
University of California, San Diego and the University of Illinois
at Chicago lead the research team, with funded partners at
Northwestern University, San Diego State University, the
Information Sciences Institute at the University of Southern
California, UC Irvine and Texas A&M University, and industrial
partners IBM, Sun Microsystems, Telcordia Technologies, Inc. and
Chiaro Networks.
See www.optiputer.net.