Rethinking Bulk Data Transfers for Next-Generation Applications
Computer Science Department, Carnegie Mellon University
Tuesday, March 18th, 2008
Low (CII) 3051 - 4:00 p.m. to 5:00 p.m.
Refreshments at 3:30 p.m.
How did you use the Internet today? The answer to this question has
significantly evolved in the last decade. Ten years ago, we were browsing
simple websites with text and images, and communicating via instant
messaging and email. In addition to these applications, today's users are
engaging in on-demand video streaming, multimedia conferencing, and sharing
files ranging from software updates to personal music, and as a result are
transferring large volumes of data (on the order of megabytes) more
frequently than ever.
Hence, bulk data transfers at the core of these applications are becoming
increasingly important and are expected to provide high throughput and
efficiency. Contrary to these expectations, however, our study of file
sharing networks confirms previous observations that bulk data transfers
are slow and inefficient, motivating the need to rethink their design.
In this talk, I will present my approach to address a prominent performance
bottleneck for these bulk data transfers: a lack of sufficient sources of
data to download from. My work addresses this challenge by (1) exploiting
network peers that serve files similar to the file being downloaded, and
(2) coupling all the available network resources with similar data on
the local disk of a receiver. My talk will also highlight the system
design and implementation for the above solutions. For example, I will
discuss handprinting, a novel and efficient algorithmic technique to locate
the additional similar network peers with only a constant overhead.
Finally, a transfer system that simultaneously benefits from disk and
network is required to work well across a diverse range of operating
environments and scenarios resulting from varying network and disk
performance. I will present the design principles for an all-weather
transfer system that adapts to a wide spectrum of operating conditions by
monitoring resource availability.
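
As a rough illustration of how a handprint-style lookup could find similar
sources with constant overhead, the sketch below hashes each chunk of a file
and keeps only the k smallest hashes as the file's handprint. The fixed-size
chunking, the use of SHA-1, the value of k, the in-memory dictionary used as
an index, and all function names are assumptions made for this example; it is
not the speaker's implementation.

```python
import hashlib

CHUNK_SIZE = 16 * 1024   # assumed fixed chunk size in bytes
HANDPRINT_SIZE = 4       # assumed k: number of hashes kept per file


def chunk_hashes(data, chunk_size=CHUNK_SIZE):
    """Return the set of SHA-1 hex digests of the file's fixed-size chunks."""
    return {
        hashlib.sha1(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    }


def handprint(data, k=HANDPRINT_SIZE, chunk_size=CHUNK_SIZE):
    """Keep only the k smallest chunk hashes as a deterministic sample."""
    return sorted(chunk_hashes(data, chunk_size))[:k]


def publish(index, peer, data, k=HANDPRINT_SIZE, chunk_size=CHUNK_SIZE):
    """Register a peer under each hash in its file's handprint (k entries)."""
    for h in handprint(data, k, chunk_size):
        index.setdefault(h, set()).add(peer)


def find_similar_sources(index, data, k=HANDPRINT_SIZE, chunk_size=CHUNK_SIZE):
    """Look up the k handprint hashes; any hit is a peer sharing a chunk."""
    sources = set()
    for h in handprint(data, k, chunk_size):
        sources |= index.get(h, set())
    return sources
```

Because every file contributes and queries only k hashes, the number of index
operations stays the same no matter how large the file is. Two files that
share most of their chunks are likely to share at least one of their smallest
hashes, which is what lets the lookup discover the similar peer.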
Himabindu Pucha is currently a post-doctoral fellow in the Computer Science
Department at Carnegie Mellon University. She received her doctorate in
December 2007 and her Master's degree in 2003 from the Electrical and
Computer Engineering Department at Purdue University. Her research
interests span distributed systems, computer networks, and mobile computing.
She is an ACM Student Research Competition finalist this year and a
recipient of the Google Anita Borg Scholarship and the Purdue Violet Haas
Hosted by: Bolek Szymanski (x2714)
Administrative support: Sharon Simmons (x8291)
For more information:
Dr. Himabindu Pucha's Homepage
Last updated: March 14, 2008