| Research interests |
Distributed systems, content distribution services, system performance diagnosis and control, wireless sensor networks |
|
|
| Research projects |
This project investigates the feasibility, cost, and potential impact of
bounded-delay content distribution networks.
A PlanetLab prototype is used
to experiment with content placement and delivery subject to end-to-end
delay bounds.
We designed a distributed algorithm to dynamically select
a subset of servers of a CDN to host the content such that a global
delay bound is guaranteed. Evaluation results drawn from our
implementation and deployment demonstrate that despite Internet delay
variability, subsecond delay bounds can be guaranteed with a very high
probability at only a moderate content replication cost.
We further investigated load balancing in the system, which
requires taking servers' workload conditions and the
popularity of content objects into account when deciding on replication
strategies.
A generalized schedulability-maximizing problem was then formulated and
investigated for large-scale distributed systems. We explored a
family of resource allocation policies and identified promising
candidates, providing insights into the problem
for subsequent analysis.
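As a rough illustration of the placement problem described above, the sketch below picks replica servers so that every client has at least one replica within a delay bound. It is a centralized greedy set-cover heuristic, not the distributed algorithm developed in the project; the function name `select_replicas`, the server and client names, and the delay values are all hypothetical.

```python
from typing import Dict, List, Set


def select_replicas(delay_ms: Dict[str, Dict[str, float]], bound_ms: float) -> List[str]:
    """Greedily choose servers until every client has a replica within bound_ms.

    delay_ms[server][client] is an assumed measured delay in milliseconds.
    """
    clients: Set[str] = {c for row in delay_ms.values() for c in row}
    # For each server, the set of clients it can serve within the bound.
    covers = {
        s: {c for c, d in row.items() if d <= bound_ms}
        for s, row in delay_ms.items()
    }
    uncovered = set(clients)
    chosen: List[str] = []
    while uncovered:
        # Take the server that covers the most still-uncovered clients
        # (iterating in sorted order makes tie-breaking deterministic).
        best = max(sorted(covers), key=lambda s: len(covers[s] & uncovered))
        gained = covers[best] & uncovered
        if not gained:
            raise ValueError(f"no server within {bound_ms} ms of: {sorted(uncovered)}")
        chosen.append(best)
        uncovered -= gained
    return chosen


# Hypothetical delay measurements (server -> client -> delay in ms).
delays = {
    "s1": {"a": 120, "b": 300, "c": 900},
    "s2": {"a": 800, "b": 250, "c": 200},
    "s3": {"a": 100, "b": 950, "c": 1200},
}
replicas = select_replicas(delays, bound_ms=500)
```

With these made-up delays and a 500 ms bound, two of the three servers suffice. A tighter bound generally forces more replicas, which is the replication-cost trade-off the project evaluates.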
Distributed systems continue to grow in scale and complexity,
resulting in increasingly involved interactions
among components and increasingly intricate failure
modes that are very hard to diagnose manually. This increased
vulnerability of larger systems, together with the
increased difficulty of failure diagnosis, has motivated machine
learning approaches to automate the diagnosis task.
While encouraging preliminary results have been achieved, scaling
the existing approaches up to large applications remains
challenging. As scale increases, current approaches
suffer from the curse of dimensionality, exacerbated by the exploding set
of system states and measured metrics. In this project,
we present techniques that significantly improve the scalability of
automated performance diagnosis methods.
We developed protocols and systems for localization,
multi-frequency MAC, and distributed storage in
large-scale wireless
sensor networks, in which individual nodes are subject to severe resource
constraints.
Production-run software failures cause endless grief to
end users and endless challenges to programmers, who
commonly have incomplete information about the
bug and face great hurdles in reproducing it. Users are often
unable or unwilling to provide diagnostic information due
to technical challenges and privacy concerns;
even if the information is available, failure analysis is
time-consuming.
We propose performing initial diagnosis automatically
and at the end user's site. The moment of failure is a
valuable commodity programmers strive to reproduce --
leveraging it directly reduces diagnosis effort while
simultaneously addressing privacy concerns.
Additionally, we propose a failure diagnosis protocol.
As far as we know, this is the first such automatic protocol
proposed for online diagnosis. By mimicking the steps
a human programmer follows when dissecting a failure, we deduce
important failure information. Beyond online use,
this can also reduce the effort of in-house testing.
|
|
| Publications |
- Joseph Tucek, Shan Lu, Chengdu Huang, Spiros Xanthos, and Yuanyuan Zhou
Triage: Diagnosing Production Run Failures at the User's Site
The 21st ACM Symposium on Operating Systems Principles (SOSP'07), October 2007, to appear.
- Chengdu Huang, Tarek Abdelzaher, and Xue Liu
On Dominating Set Allocation Policies in Real-Time Wide-Area Distributed Systems
The 19th Euromicro Conference on Real-Time Systems (ECRTS 07), to appear.
- Chengdu Huang, Ira Cohen, Julie Symons, and Tarek Abdelzaher
Achieving Scalable Automated Diagnosis of Distributed Systems Performance Problems
HP Labs Tech Report, HPL-2006-160.
- Liqian Luo, Chengdu Huang, Tarek Abdelzaher, John A. Stankovic, and Xue Liu
EnviroStore: A Cooperative Storage System for Disconnected Operation in Sensor Networks
IEEE INFOCOM 2007, to appear.
- Jingbin Zhang, Gang Zhou, Chengdu Huang, Sang H. Son, and John A. Stankovic
TMMAC: An Energy Efficient Multi-Channel MAC Protocol for Ad Hoc Networks
IEEE International Conference on Communications (ICC 2007), to appear.
- Joseph Tucek, James Newsome, Shan Lu, Chengdu Huang, Spiros Xanthos, David Brumley, Yuanyuan Zhou, and Dawn Song
Sweeper: A Lightweight End-to-end System for Defending Against Fast Worms
The 2nd ACM SIGOPS EuroSys (EuroSys'07), to appear.
- Tarek F. Abdelzaher and Chengdu Huang
End System Quality of Service
Book Chapter, The Handbook of Computer Networks, John Wiley and Sons, 2007
- Gang Zhou, Chengdu Huang, Ting Yan, Tian He, John A. Stankovic, and Tarek F. Abdelzaher
MMSN: Multi-Frequency Media Access Control for Wireless Sensor Networks
IEEE INFOCOM 2006, Barcelona, Spain, to appear.
- Chengdu Huang, Gang Zhou, Tarek F. Abdelzaher, Sang H. Son, and John A. Stankovic
Load-Balancing in Bounded-Latency Content Distribution
The 26th IEEE International Real-Time Systems Symposium (RTSS 2005),
December 2005.
- Chengdu Huang and Tarek F. Abdelzaher
Bounded-Latency Content Distribution: Feasibility and Evaluation
IEEE Transactions on Computers, November 2005
- Chengdu Huang, Seejo Sebastine, and Tarek F. Abdelzaher
Design, Implementation and Evaluation of a Real-Time Active Content Distribution Service
Real-Time Systems Journal, Springer, 2005
- Tian He, Chengdu Huang, Brian M. Blum, John A. Stankovic, and Tarek F. Abdelzaher
Range-free Localization and Its Impact on Large Scale Sensor Networks
ACM Transactions on Embedded Computing Systems, 2005
- Tarek F. Abdelzaher and Chengdu Huang
Security and Web Quality of Service
Book Chapter, The Handbook of Information Security, John Wiley and Sons, 2005
- Chengdu Huang and Tarek F. Abdelzaher
Towards Content Distribution Networks with Latency Guarantees
The Twelfth IEEE International Workshop on Quality of Service
(IWQoS 2004)
- Chengdu Huang, Seejo Sebastine, and Tarek F. Abdelzaher
An Architecture for Real-Time Active Content Distribution
The 16th Euromicro Conference on Real-Time Systems (ECRTS 04)
- Tian He, Chengdu Huang, Brian M. Blum, John A. Stankovic, and Tarek F. Abdelzaher
Range-free Localization Schemes in Large Scale Sensor Networks
The Ninth Annual International Conference on Mobile Computing
and Networking (MobiCom 2003)
|
Last updated: July 30 2007
|