March 2010
Bionimbus
Bionimbus is a cloud-based system for managing, analyzing, and sharing genomic data. For more information, visit the Bionimbus web site.
February 2010
SUMMER 2010 REU University of Illinois at Chicago (UIC)
Research Experience for Undergraduates in Computer Science
Sponsored by The National Center for Data Mining at UIC & The National Science
Foundation
The National Center for Data Mining (NCDM) at UIC is hosting a Research
Experience for Undergraduates (REU), Summer 2010.
Interactive Systems Biology App: Museums and Beyond Project
A team will develop an iPod Touch
or iPad interactive game, "Make a Fly." The app will combine transcriptional
network data with images of embryonic, larval, and pupal biology to create a
dynamic 3-D, "Google Earth"-like semi-transparent information landscape. The
learning experience is goal-based: the aim is to produce a successfully emerged
adult Drosophila melanogaster fly. The colorful app, after formative development
and testing, will run on an iPod Touch or iPad tablet in the Chicago Museum of
Science and Industry's Genetics: Decoding Life exhibition. The user will guide
the virtual animal's developmental pathway by moving the pod forward in three
dimensions to accelerate biological growth. The user wins the game by keeping
activities aligned with the normal regulatory events portrayed on the screen,
without bumping into chromosomes, mutating critical genes, or altering important
temporal/spatial environmental signals.
Program Leaders: Dr. Robert L. Grossman (UIC) and Dr. Barry Aprison (UofC)
Dates: June 7 to August 2, 2010 (Shorter periods may be available)
Location: University of Illinois at Chicago in Chicago, Illinois
Applications received before May 1, 2010 will receive first consideration.
Eligibility and Prerequisites: Open to undergraduate students who are U.S.
citizens or permanent residents. Computer science prerequisites: Networking,
Operating Systems, C++, Python. (iPhone/iPod Touch OS 3.1; iPhone SDK 3.1.2;
iPad/iPhone SDK 3.2beta).
Support: Participants will receive a $250 a week stipend and a travel
subsidy of up to $400. Campus recreation facilities will be accessible to
students in the program. Dormitory housing may be available through the
program.
Please email your application form to: reu@ncdm.uic.edu
*Include "REU 2010" in the subject line of all emails*
Contact person:
Shirley Connelly
322 SEO, MC 249
851 South Morgan Street
Chicago, IL 60607
Tel: 312-413-2176
Fax: 312-355-0373
December 2009
Open Cloud Consortium Wins Bandwidth Challenge at SC09 Conference in Portland, Oregon.
Dec. 7, 2009 Portland, Oregon
At the International Conference for High Performance Computing, Networking, Storage,
and Analysis (SC09), a research team led by the Open Cloud Consortium won the Bandwidth
Challenge competition for new technology to support data intensive applications over wide
area clouds. In addition, the Open Cloud Consortium won in the Best Overall category of the
SC09 Bandwidth Challenge.
Full Announcement (pdf).
November 2009
UDX high performance protocol announced.
Nov. 17, 2009 Portland, Oregon
Today, at the International Conference for High Performance Computing,
Networking, Storage, and Analysis (SC09), a research consortium
demonstrated a new high performance transport protocol, UDX, for data intensive
science applications. The UDX demonstration transported 9.3-9.6 Gbps streams
over C-Wave, a 7,000-mile US national testbed based on lightpath channels
within optical fiber.
Full Announcement (pdf).
SC09: NCDM wins the Bandwidth Challenge Competition.
The NCDM/iCAIR/NRL team demonstrated three applications to show efficient
bandwidth utilization in distributed data intensive applications. The first
demo processed very large datasets over 256 servers in 4 data centers
connected by wide area high speed networks. The data analysis application
exchanged data at over 100Gb/s among participating nodes. This application
uses the open source software Sector/Sphere and UDT, developed by NCDM. The
second demo was a cloud based image rendering application that delivers very
high resolution visualization (computed by remote cloud systems) over long
distance InfiniBand and IPv6. A hardware implementation of UDT was deployed to
support the long distance InfiniBand protocol. The third demo showcased a
lightweight UDT variant called UDX, which can transfer data at 9.xGb/s using a
single connection over a 10Gb/s network with 200ms RTT. Overall, our team
achieved 25Gb/s sustained throughput over a 200ms RTT, 12,000-mile path
using only seven servers on the SC09 floor.
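As a rough illustration of why a 200ms RTT path is demanding, the bandwidth-delay product gives the amount of data that must be in flight to keep a link full. The sketch below is standard networking arithmetic, not code from UDT or UDX; the function names are hypothetical, and the 64 KiB figure is the classic TCP window limit without window scaling, used here only for comparison.

```python
def bandwidth_delay_product_bytes(link_gbps: float, rtt_ms: float) -> float:
    """Bytes that must be unacknowledged (in flight) to fill the link for one RTT."""
    bits_in_flight = link_gbps * 1e9 * (rtt_ms / 1000.0)
    return bits_in_flight / 8.0


def window_limited_throughput_mbps(window_bytes: float, rtt_ms: float) -> float:
    """Upper bound on throughput when at most window_bytes may be in flight."""
    return window_bytes * 8.0 / (rtt_ms / 1000.0) / 1e6


# Figures from the text: a 10Gb/s network with a 200ms round-trip time.
bdp = bandwidth_delay_product_bytes(link_gbps=10, rtt_ms=200)
print(f"{bdp / 1e6:.0f} MB in flight to fill the link")  # prints "250 MB ..."

# Assumption for comparison only: a 64 KiB window (TCP without window scaling).
limit = window_limited_throughput_mbps(window_bytes=64 * 1024, rtt_ms=200)
print(f"{limit:.2f} Mb/s ceiling with a 64 KiB window")  # prints "2.62 Mb/s ..."
```

Under these assumptions a sender needs roughly 250 MB of buffered, unacknowledged data to saturate the path, which is one reason long-distance transfers benefit from protocols designed for large windows.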
Final Results of the SC09 Bandwidth Challenge:
Category - Rich: manifold-process implementations including diverse mechanisms.
Winner: National Center for Data Mining/Univ of IL at Chicago
Category - Classic: data movement.
Winner: Caltech
Category - Impact: developments strongly affecting the target communities.
Winner: University of Tokyo
Category - Overall
Winner: National Center for Data Mining/Univ of IL at Chicago
SC09: Sphere TeraSort performance visualization
The NCDM-developed Sphere platform was demonstrated at the SuperComputing
2009 conference running a TeraSort HPC benchmark. LAC Cluster Monitor
is used to visualize compute node utilization during the Sphere TeraSort
run. Each square represents a single node, and its color indicates system load.
SC09: Canopy visualization
Canopy, the NCDM-developed virtual network management library, was demonstrated
at the SuperComputing 2009 conference. This demonstration shows the Canopy system
being used to switch between two sets of Web and SQL virtual machines. As the Web
and SQL servers are alternated, the database web interface updates to show which
resources are accessed. All Canopy operations demonstrated took place on OSI layer 2
and without signaling the Web and SQL instances.
SC09: UDXnet BWC visualization
The NCDM-developed UDT high performance network protocol was demonstrated across
a 12,000-mile network at the SuperComputing 2009 conference. This demonstration
shows the output of the udx command line program as it quickly scales up to over
8Gbps on the 12,000-mile network.
WTTW 11 Chicago presents a segment on Cloud Computing
'Chicago has become a world center of "cloud computing." As we continue our
Chicago Matters: Beyond Burnham series, Rich Samuels explains what "cloud
computing" is and how you probably already use it on a daily basis.'
Video link.
Summer 2009 Undergraduate Research Opportunity in Computer Science
Sponsored by The National Center for Data Mining at UIC & The National Science
Foundation.
The National Center for Data Mining at UIC is hosting a Research Experience for Undergraduates (REU).
Students will work on research projects using clouds for high performance computing, for applications in
genomics and systems biology.
Information: reu-2009.pdf
Application: reu-2009-application.pdf
Sterling Commerce adopts UDT
Sterling Commerce, an AT&T Inc (NYSE: T) company, today announced Sterling File
Accelerator (SFA). SFA combines the power of the company's Connect:Direct
point-to-point file transfer software optimised for high-volume, secure,
assured delivery of files with a new UDP Data Transfer-based file transport
(UDT) - an application-level data transport protocol that overcomes the latency
issues associated with transmission control protocol (TCP)-based transmissions.
source: iTWire.com.
UDT was developed by the National Center for Data Mining.
December Talk
U.K. e-Science and Exploiting Research Data
Speaker: Dr. Malcolm Atkinson
Friday, December 5, 2008
1-2 pm
Location: 636 SEO
ABSTRACT
In 2000 the U.K. coined the word "e-Science" for a long-established
research strategy: making the best use of advances in computing science
to enable new research methods. It recognized this as a two way dynamic
process and placed emphasis on advances in distributed computing and on
exploiting the opportunities delivered by the growing bonanza of data in
all fields of research. I will argue that this combination requires new
architectures and will discuss experiences of using data streaming
architectures in the OGSA-DAI product and the ADMIRE research project.
Short Biography:
Malcolm Atkinson is Director of the e-Science Institute. He is the UK
e-Science Envoy and plays a leading role in the Open Middleware
Infrastructure Institute UK, is on the advisory boards of the National
Grid Service, the National Centre for e-Social Science, and Baltic Grid.
He led the EU IST project 'International Collaboration to Extend and
Advance Grid Education' (ICEAGE). This project organized the
International Summer School on Grid Computing (ISSGC) and he chaired the
Programme Committee for ISSGC'06, ISSGC'07 and ISSGC'08. He is a member
of the Joint Information Systems Committee Board and JISC Support of
Research Committee. He is a representative of the UK at the
e-Infrastructure Reflection Group.
He led the development of the Department of Computing Science in Glasgow
and is now Professor of e-Science in the School of Informatics,
University of Edinburgh. He has more than 130 publications. His current
research is concerned with data integration and its exploitation. He is
currently the lead architect on an EU Framework Programme 7 project
called Advanced Data Mining and Integration Research for Europe (ADMIRE).
November 2008
NCDM receives SC|08 Conference
Bandwidth Challenge Award.
AUSTIN, Texas, Nov. 20 -- SC08 -- The National Center for Data Mining (NCDM) at
UIC and the Open Cloud Consortium received the SC08 Bandwidth Challenge award
today in Austin.
Their entry was titled "Towards Global Scale Cloud Computing: Using Sector and
Sphere on the Open Cloud Testbed" and was led by Dr. Yunhong Gu of the
University of Illinois at Chicago and Dr. Robert Grossman of the University of
Illinois at Chicago and Open Data Group.
Although cloud computing is common today, data processing in clouds is
almost always done within a single datacenter because of the technical challenges
of processing data across multiple datacenters. The team today demonstrated,
for the first time, technology that enables cloud computing to utilize high
performance networks and spread cloud computing across datacenters to create
wide area clouds. The technology that makes this possible is the open source
Sector storage cloud and Sphere compute cloud developed by the NCDM.
For this challenge, NCDM used the Open Cloud Testbed, which is managed by the
Open Cloud Consortium. The Open Cloud Consortium develops standards for
computing within clouds and frameworks for interoperating between clouds.
"A whole new generation of cloud computing is now possible using the open
source Sector storage cloud and the Sphere computing cloud and standards
developed by the Open Cloud Consortium. For the first time, developing
applications that span multiple distributed clouds is now possible," according
to Robert Grossman.
According to Joe Mambretti, director of the International Center of Advanced
Internet Research at Northwestern University and co-director of the Open Cloud
Testbed, "These innovative technologies provide unique capabilities that will
enable new generations of applications based on extremely large scale data
streams."
During the Bandwidth Challenge at SC08, the team demonstrated three
applications that used the Sector/Sphere cloud. The first application transported
bioinformatics data using Sector from the conference floor in Austin to
Kitakyushu in Japan at over 8 Gb/s.
The second application demonstrated was Creditstone, which is a benchmark for
financial services applications. The Sector/Sphere implementation of
Creditstone processed about 53.5 billion synthetic credit card transaction
records in less than 1 hour.
The third application was TeraSort, which sorted 1 terabyte of data within 30
minutes. The average data moving rate was about 4.8Gb/s in the Open Cloud
Testbed, with a peak speed reaching 10Gb/s.
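To put the Creditstone and TeraSort figures on a common scale, the short calculation below converts them into per-second rates. It is a back-of-the-envelope sketch, not a measurement: it assumes 1 terabyte means 10^12 bytes, and because the text gives only upper bounds on run time ("less than 1 hour", "within 30 minutes"), the results are lower bounds on the processing rates. The function names are hypothetical.

```python
def min_rate(total_units: float, max_seconds: float) -> float:
    """Lower bound on units processed per second, given an upper bound on run time."""
    return total_units / max_seconds


def bytes_per_s_to_gbps(bytes_per_second: float) -> float:
    """Convert bytes per second to gigabits per second (decimal units)."""
    return bytes_per_second * 8.0 / 1e9


# Creditstone: about 53.5 billion records in less than 1 hour.
records_per_s = min_rate(53.5e9, 60 * 60)
print(f"Creditstone: at least {records_per_s / 1e6:.1f} million records/s")

# TeraSort: 1 terabyte (assumed 10**12 bytes) sorted within 30 minutes.
sort_bytes_per_s = min_rate(1e12, 30 * 60)
print(f"TeraSort: at least {sort_bytes_per_s / 1e6:.0f} MB/s sorted, "
      f"or {bytes_per_s_to_gbps(sort_bytes_per_s):.2f} Gb/s")
```

These derived rates describe end-to-end processing only; the average and peak data moving rates quoted above are separate network figures reported by the team.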
One of the key achievements of the Sector and Sphere software is that it is
very easy to use. For example, the TeraSort code only requires about 50 lines
of C++ code. This is critical, as it allows researchers to use their time to
focus on research problems, rather than spending time dealing with distributed
programming.
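To give a sense of why a large sort can be expressed compactly, here is a minimal single-process sketch of the range-partition-then-sort pattern commonly used for TeraSort-style benchmarks. It is not Sector/Sphere code and does not use its API; the function names, the sampling strategy, and the bucket count are all hypothetical choices made for illustration. In a distributed setting each bucket would be sorted on a different node.

```python
import random
from bisect import bisect_right


def choose_splitters(keys, num_buckets, sample_size=1000, seed=0):
    """Pick num_buckets - 1 boundary keys from a sorted random sample."""
    rng = random.Random(seed)
    sample = sorted(rng.sample(keys, min(sample_size, len(keys))))
    return [sample[(i * len(sample)) // num_buckets] for i in range(1, num_buckets)]


def partition(keys, splitters):
    """Route each key to the bucket whose key range contains it."""
    buckets = [[] for _ in range(len(splitters) + 1)]
    for key in keys:
        buckets[bisect_right(splitters, key)].append(key)
    return buckets


def bucketed_sort(keys, num_buckets=4):
    """Sort by range-partitioning, sorting each bucket, then concatenating."""
    if not keys:
        return []
    splitters = choose_splitters(keys, num_buckets)
    buckets = partition(keys, splitters)
    # Each bucket is independent, so this step could run in parallel on separate nodes.
    sorted_buckets = [sorted(bucket) for bucket in buckets]
    return [key for bucket in sorted_buckets for key in bucket]


data = [random.Random(42).randrange(1_000_000) for _ in range(10)]
print(bucketed_sort(data) == sorted(data))  # prints True
```

Because the buckets cover disjoint, ordered key ranges, concatenating the individually sorted buckets yields a globally sorted result without a final merge step.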
According to Yunhong Gu, "Sphere is a new software system that supports
simplified distributed data processing application development. In contrast to
traditional distributed computing methods such as MPI, Sphere allows users to
write distributed applications with a few lines of code and without knowing the
details of the underlying hardware."
Source: HPC Wire
The National Center for Data Mining (NCDM) at the University of Illinois at
Chicago (UIC) was established in 1998 to serve as a resource for research,
standards development, and education for high performance and distributed data
mining and predictive modeling.
The NCDM is supported, in part, by the National Science Foundation, the
Chicago Bioinformatics Consortium, the Department of Defense, and the
University of Illinois at Chicago, as well as by other funding agencies
and NCDM's industrial partners.
NCDM is comprised of the Laboratory for Advanced Computing (LAC), Laboratory for Machine Learning and Data Mining (MLDM), Prof. Leland Wilkinson's group and Prof. Philip Yu's group.
The Center's recent projects:
- Teraflow Testbed - distributed infrastructure designed to use new 10 Gb/s network protocols and data services.
- Sector - infrastructure software providing distributed data storage, access and analysis/processing functionality.
- Angle - network monitoring software to detect anomalous network events across multiple monitoring sites.
- SidGrid - social informatics data collection and collaborative analysis software utilizing web and grid services.
- largedataarchive - hosts a variety of large data sets for use by the larger research community.
- UDT - application level data transport protocol for the emerging distributed data intensive applications over wide area high-speed networks.
The Center focuses on three research areas:
- Scaling algorithms, applications and systems to massive data sets.
- Developing algorithms, applications, and systems for mining distributed data.
- Establishing standard languages, protocols, and services for data mining and predictive modeling.
The NCDM is a co-founding member of the Data Mining Group (DMG), which develops
the Predictive Model Markup Language (PMML) and related standards.
Recent News and Awards page.
Groups within NCDM
- Prof. Leland Wilkinson's Group
- Prof. Philip Yu's Group