From: Chandrasekar Ramachandran [cramach2@uiuc.edu]
Sent: Thursday, March 13, 2008 10:37 AM
To: indy@cs.uiuc.edu
Subject: 525 review 03/13

Reviews:

1. AVMON: Optimal and scalable discovery of consistent availability monitoring overlays for distributed systems, R. Morales et al, ICDCS 2007

Summary: This paper presents a protocol for discovering an availability monitoring overlay in a large-scale distributed system. The protocol is unique in that it satisfies the requirements of consistency, verifiability, randomness, discoverability, load balancing and scalability by design. An additional design modification optimizes memory and communication bandwidth. Experimental results on synthetic traces and on traces from PlanetLab and Overnet show that the average discovery time is very low, that the protocol works well even with frequent joins and leaves, and that nearly the entire monitor set is discovered within the first 60 seconds.

Pros:
1. Even with growing networks, the discovery time remains scalable and nodes are discovered at regular intervals.
2. Forgetful pinging, as Figure 7 shows, reduces useless pings to nodes that are no longer present in the network.

Cons: The choice of N for the PlanetLab and Overnet traces, in the results on the effect of coarse view size, is not explained clearly.

2. MON: On-demand overlays for distributed system management, J. Liang et al, WORLDS 2005.

Summary: This paper describes the management overlay network built on the PlanetLab testbed. By not relying on other infrastructures and by focusing on on-demand execution of management commands, this paper provides a novel and interesting way to tackle this problem. The protocol is lightweight and efficient. Various commands such as count, depth and topology are implemented. The experimental results show that the response time is nearly 97% using a two stage tree construction algorithm and that the average bandwidth used is around 490kbps.

Pros:
1. Low bandwidth utilization and response time in the experiments.
2. The design of MON, which makes it easier to execute simultaneous commands.

Cons:
1. The choice of scheduling algorithm is restricted to first fit.

Common Thread: Management overlay systems, with the first paper focusing on locating and discovering and the second one on executing management commands.

From: hossein.ahmadi@gmail.com on behalf of Hossein Ahmadi [hahmadi2@uiuc.edu]
Sent: Thursday, March 13, 2008 9:28 AM
To: indy@cs.uiuc.edu
Subject: 525 review 03/13

Hossein Ahmadi (hahmadi2)
===============================================================================
AVMON: Optimal and Scalable Discovery of Consistent Availability Monitoring Overlays for Distributed Systems
Ramses Morales and Indranil Gupta

Application monitoring in distributed systems is useful for several applications, such as availability-based replication. The authors first identify six main characteristics of such applications: 1) consistency, 2) verifiability, 3) randomness, 4) discoverability, 5) load balancing, 6) scalability. These metrics are proposed to maintain efficiency while preventing nodes from faking reports. The paper proposes AVMON, a distributed application monitoring protocol which uses a hash function to build a set of consistent, verifiable, and random nodes to monitor each node in the system. Given that the availability information of a node is collected by such a group of other nodes, the problem is how to make this information discoverable by other nodes. Moreover, there should be a scheme to discover monitors when a node joins the system. AVMON addresses both issues by maintaining a coarse view at each node, where the coarse view is a limited neighbor list for a coarse overlay. The authors rigorously analyze the time and message complexity of the monitoring and discovery operations in AVMON.
They also study the resilience of AVMON against node collusion, showing that at large scale the probability of collusion tends to zero. The paper then evaluates the protocol through experiments on traces from PlanetLab and Overnet, which mainly show how well the protocol responds in terms of time as the system scales, as well as how it responds to errors.

The paper's approach of identifying and addressing all requirements of application monitoring systems makes it valuable and solid. The analysis and evaluation are also thorough. AVMON is especially interesting because it provides collusion resilience. The only missing part in this paper is a complete comparison of its performance against current techniques via analysis or simulation.

-------------------------------------------------------------------------------
MON: On-demand Overlays for Distributed System Management
Jin Liang, Steven Ko, Indranil Gupta and Klara Nahrstedt

This paper addresses the problem of monitoring and management of distributed applications over distributed computing systems (e.g. PlanetLab). Current approaches do not allow executing instant management commands in a distributed and scalable manner. MON creates overlays on demand to eliminate the complexity of maintaining an overlay at all times.

MON runs at each node and is separated into three layers. The lowest layer maintains the membership information at each node through a gossip-like approach. The second layer builds a tree or DAG-like on-demand overlay network for status checks or software push. A random tree construction is used instead of complete coverage of all nodes, since the set of alive nodes cannot be defined precisely. The third layer is instant command execution, at which a status query or software push can be executed. A status query is propagated through a tree overlay and the replies are collected. Software push, however, is executed over a DAG to enable nodes to use more downstream bandwidth.
MON is evaluated by deploying it on 330 nodes in PlanetLab. The authors first verify node coverage and overlay creation time in different settings. Then they study the achieved bandwidth in the software push operation. The results show that MON can cover most of the nodes in a reasonable amount of time and provide appropriate bandwidth for software push.

There is a tradeoff between maintaining an overlay and using an on-demand approach. The former is more efficient in stable networks, while the latter can reduce the overhead in highly dynamic networks. Moreover, a monitoring application usually needs to provide availability information in a timely manner, which requires a persistent overlay. Using an on-demand approach makes MON suitable only for applications like status checks. The other issue is that MON does not consider false status reports from nodes. Especially when nodes are organized in a tree, an intermediate node can modify and fake the status of all nodes in its subtree.
===============================================================================

From: Mirko Montanari [mmontan2@uiuc.edu]
Sent: Thursday, March 13, 2008 9:18 AM
To: Gupta, Indranil
Subject: 525 review 03/13

CS 525 - 03/13/2008 by Mirko Montanari

MON: On-demand Overlays for Distributed System Management
Jin Liang, Steven Ko, Indranil Gupta, Klara Nahrstedt -- UIUC

This paper describes a system for the management of a large number of machines. The motivation of this work comes from the problem of deploying applications in a real distributed environment (in this case PlanetLab). During deployment, new applications need to be loaded on multiple server nodes and their execution needs to be monitored. A centralized approach could lead to a bottleneck, so the authors propose a decentralized approach based on overlays.
This decentralized approach creates a particular type of overlay for each command that needs to be executed: software push can benefit from a DAG structure, while a tree structure is more appropriate for monitoring queries (which require aggregation).

PRO:
- Provides a way to manage and monitor a large number of machines at the same time efficiently.
- I like that the paper addresses the problems typically encountered in the deployment and testing of applications.

CONS:
- The benefit of creating overlays on the fly depends on the frequency with which queries and commands are issued. In the case of a periodic query, say “get the average load of the machines every 10 seconds”, maintaining an overlay may be more efficient than creating a new one every 10 seconds. No results are provided to prove or disprove this.
- Security is only slightly addressed, even though it should be a key part of this architecture. MON allows arbitrary commands to be run on a large number of machines, and it could become a target for attacks.

AVMON: Optimal and Scalable Discovery of Consistent Availability Monitoring Overlays for Distributed Systems
Ramses Morales, Indranil Gupta -- UIUC

This paper proposes the construction of a monitoring overlay to monitor the availability of nodes. Unlike the MON paper, which monitors the state of each node and can obtain information such as processor load and number of processes, this paper focuses on monitoring node availability, checking whether nodes are still reachable. Particular attention has been paid to providing a system that is resilient to attack: the proposed method for selecting who monitors each node cannot be compromised even if a set of nodes collude to provide fake availability information.

PRO:
- Using a hash for determining the “monitoring - monitored” relation is interesting.
- I liked the analytical treatment of the protocol.

CONS:
- Is ping enough to guarantee that the node is available? Ping could be substituted with some application-level availability check to ensure that the application is still running.
- Why not integrate the availability monitoring with application or overlay-routing level communications? This would give monitoring “for free”, by only checking whether the node is still participating in the execution of the protocols. I understand that this could reduce the “randomness”, but it might be possible to obtain protection from colluding nodes through other means…

An interesting extension of this work could be the integration of availability predictors (e.g. the work of James Mickens), both to provide a better estimate of the availability of nodes and to improve the management of the monitoring overlay itself.

From: rebolledodaniel@gmail.com on behalf of Daniel Rebolledo Samper [dreboll2@uiuc.edu]
Sent: Thursday, March 13, 2008 9:05 AM
To: Gupta, Indranil
Subject: 525 review 03/13

MON: ON-DEMAND OVERLAYS FOR DISTRIBUTED SYSTEM MANAGEMENT

This article presents MON, a lightweight system for managing other distributed systems. It tracks membership through a gossip protocol: nodes periodically exchange partial views of the system (small lists of other nodes) and update their own, forgetting about the oldest nodes if need be. MON dynamically creates an overlay to perform administrative tasks like starting, querying, updating and stopping the distributed application running on the nodes. The overlay can be a spanning tree of fanout k or a DAG. It can be shown that if k = Omega(log N) (N: number of nodes) then all the reachable nodes are covered with high probability.

MON provides a small number of commands to query the state of the distributed system, and processes data in the network to compute aggregates. These commands are count, depth, topology (which provide information on the overlay itself), and the aggregates avg, top, histo.
MON also provides a generic filter command that can be used to run any operation on the entire system.

To update node software, MON first builds a DAG. It then splits the file to be transferred into small blocks and, until the transfer is complete, each node notifies its children of the pieces it currently has. Nodes can then download the pieces from multiple nodes, thereby increasing the available bandwidth dramatically.

MON has many good points: it simplifies distributed system management in a completely decentralized way. The idea of using on-demand overlays is also a good one: the authors trade off a little performance (a couple of seconds for overlay construction) for a lot of simplicity in the design. Finally, they exploit locality in a simple way with the two-stage tree building algorithm.

Two questions remain, however. First, what happens if a node fails or becomes unreachable during a query operation but before it is able to report to its parent? Maybe the administrator should be notified of this in some way. Second, what happens if some node doesn't get a protocol update (because of a transient link failure, for example) but later continues interacting with the rest of the distributed system, which has been updated? The system designer should bear this risk in mind by embedding, say, a version number in the system's messages and programming the nodes to discard packets from old versions. Otherwise, since MON doesn't guarantee that all nodes get updated, the system could fail in bizarre ways.

AVMON: OPTIMAL AND SCALABLE DISCOVERY OF CONSISTENT AVAILABILITY MONITORING OVERLAYS FOR DISTRIBUTED SYSTEMS

This article presents AVMON, a protocol designed to select availability-monitoring nodes in a distributed system. Each node monitors the availability of a certain set of nodes (the so-called Target Set or TS) and is monitored by a second set (the so-called Pinging Set or PS). Nodes may fail-stop and may also collude to artificially increase their availability by misreporting their numbers.

The authors present six requirements such a system should satisfy. Consistency: the PS and TS relations should be consistent; Verifiability: said relations should be easy to verify by a third party and difficult to forge or fake; Randomness: the relations should be built by selecting nodes in a uniform i.i.d. way; Discoverability: nodes should be able to discover the other nodes related to them via TS or PS relatively quickly, or at least a certain number thereof; Load Balancing and Scalability.

They then define the PS relation by choosing y to be in PS(x) if and only if H(y,x) is less than or equal to K/N for some K, where x and y represent the IP addresses and port numbers of the nodes and H is a consistent (secure) hash function normalized to the interval [0,1]. This immediately provides consistency, verifiability and randomness. To discover the PS and TS relations relative to a given node, nodes keep a coarse view (CV) of the system, defined as a small set of the nodes therein. Nodes periodically exchange CVs, randomly refreshing their CVs from the old one and the one they got from a neighbor. Nodes also notify nodes u and v in their CVs when u is in TS(v) or u is in PS(v).

The protocol has very impressive asymptotic properties that depend on the asymptotic behavior of the size of the CV (cvs). The authors prove that an optimum is realized if cvs = O(N^(1/4)): this yields memory and per-round bandwidth costs of O(N^(1/4)), expected TS/PS discovery time of N^(1/2) and computational overhead of O(N^(1/2)). Nodes join the system in O(log N) with high probability. Finally, an important point is that the system is asymptotically collusion resistant, assuming there are o(N/log N) colluding nodes, which means that the more nodes there are in the system, the more collusion-resistant it is.
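
As an illustration of the hash-based selection rule described above (y is in PS(x) if and only if H(y,x) <= K/N), here is a minimal Python sketch. It rests on assumptions that are not taken from the paper: SHA-256 stands in for the hash H, N is taken to be the number of known nodes, node identities are "ip:port" strings, and all function names are hypothetical.

```python
import hashlib


def normalized_hash(y: str, x: str) -> float:
    """Hash the ordered pair (y, x) to a value in [0, 1).

    Assumption: SHA-256 stands in for the consistent hash H.
    """
    digest = hashlib.sha256(f"{y}|{x}".encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer and scale into [0, 1).
    return int.from_bytes(digest[:8], "big") / 2**64


def in_pinging_set(y: str, x: str, k: int, n: int) -> bool:
    """True if y monitors x, i.e. y is in PS(x): H(y, x) <= K/N."""
    return y != x and normalized_hash(y, x) <= k / n


def pinging_set(x: str, nodes: list, k: int) -> set:
    """All nodes that monitor x (assumes N = len(nodes))."""
    n = len(nodes)
    return {y for y in nodes if in_pinging_set(y, x, k, n)}


def target_set(y: str, nodes: list, k: int) -> set:
    """All nodes that y monitors; the inverse of the PS relation."""
    n = len(nodes)
    return {x for x in nodes if in_pinging_set(y, x, k, n)}
```

Because membership depends only on the two node identities, K and N, any third node can recompute the check for a claimed monitor, and the PS and TS relations agree with each other by construction.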
Unfortunately, it is easy to show that if a malicious user controls a fixed proportion of the nodes in the system, collusion is possible (though the expected effects thereof are unknown). Therefore, this protocol may not be suited for systems where identities are extremely cheap.

From: Alejandro Gutierrez [agutie01@gmail.com]
Sent: Thursday, March 13, 2008 8:50 AM
To: Gupta, Indranil
Subject: 525 review 03/13

===============================================================
"Distributed System Management: PlanetLab Incidents and Management Tools"
Robert Adams
Reviewed by: ALEJANDRO GUTIERREZ
===============================================================
This paper from the Intel Corporation describes some management incidents that occurred on PlanetLab, and some technologies developed to deal with such problems.

Incidents are classified into four categories:
- broken hardware and driver problems;
- broken infrastructure software problems;
- networking problems (bandwidth problems and traffic type problems);
- problems with the applications or services.

The paper provides several examples for each category, including the resolution process, the root causes, and the lessons learned.

The paper then proceeds to describe some tools that have been developed specifically for PlanetLab management. These tools help:
- restore nodes remotely;
- trace traffic flow;
- propagate incident reports more efficiently and directly;
- control resources reactively/proactively.

I think this paper presents a collection of failures that have occurred in PlanetLab. Although it dates from 2005, I find it interesting to read about the technologies that the researchers at the Intel Corporation had to develop based on the lessons learned from real-life scenarios. This is where the quality of this paper lies: it doesn’t present a simulation, it presents real responses to real problems.
I find it particularly remarkable how the author suggests a number of design implications for future distributed management tools.
===============================================================

===============================================================
"MON: On-demand Overlays for Distributed System Management"
Jin Liang, Steven Ko, Indranil Gupta and Klara Nahrstedt
Reviewed by: ALEJANDRO GUTIERREZ
===============================================================
This paper describes a Management Overlay Network (MON) tool developed in the Department of Computer Science at the University of Illinois at Urbana-Champaign.

Characteristics of MON include:
- Designed for large-scale distributed systems.
- Allows users to execute instant management commands for pull-based operations such as querying application status. For these operations a tree overlay structure is used, and after analyzing the results presented in the paper I think MON achieves low command response time.
- Allows users to execute instant management commands for push-based operations such as distributing application updates. For these operations a DAG is used, and after analyzing the results presented in the paper I think MON achieves high bandwidth.
- Takes a stateless approach, building an overlay structure on demand for each management session.
- Two algorithms for tree construction are proposed: random and two stage (where the latter improves on the former by adding locality). The DAG construction algorithm is a modified version of the tree construction algorithms.
- Membership of MON servers is maintained using a gossip-style protocol.

I like that the authors allow users instant management commands because it makes the tool easy to use from the user side. Efficiency is obtained by tuning the DAG, allowing a node to download from multiple parents.
I personally doubt that MON is scalable, because I wonder what will happen if many overlays are constructed at the same time. My guess is that MON would impose a significant overhead on the network. Details about the algorithms implemented are not clearly presented, and it is not at all clear how data aggregation is done in MON. For push-based operations, significant traffic overhead can be produced for large files because each node in the DAG notifies its children about a new block whenever it receives one.

From: fariba.mahboobe.khan@gmail.com on behalf of Fariba Khan [fkhan2@uiuc.edu]
Sent: Thursday, March 13, 2008 8:17 AM
To: indy@cs.uiuc.edu
Subject: 525 review 03/13

Distributed system management: PlanetLab incidents and management tools, R. Adams, PlanetLab Techreport

This report summarizes the incidents in PlanetLab from Nov 2002 to June 2003 that required administrative involvement to solve. PlanetLab is a distributed global testbed for testing new network applications and accessing network-wide services for measurements. An application is hosted in a virtual machine on the nodes. An application running on a set of nodes is called a slice, and its part on a particular node is a sliver. PlanetLab is managed by a small set of administrators. Each node also has a local PI and a site administrator, who might or might not be the network admin for that site. Apps on PlanetLab are experimenting with new protocols, trying to understand facets of the Internet. Much of the time these apps' use of the network stresses limits and questions assumptions, going far beyond traditional usage.

Incidents in PlanetLab are in a sense similar to what we see in any network: hardware failures, software bugs, old software and misconfiguration, causing crashes, excessive traffic, inappropriate traffic and attacks. The problem is that these are caused by an app (sliver / slice) from a user at site A, running on machines at site(s) B and managed by PlanetLab admins at site C.
It takes days even to get the email from the bank that is supposedly being attacked by a node to the appropriate people at multiple sites. Many of the problems cannot be handled off-site, and then all these personnel need to collaborate to fix them locally.

Some of the problems I assume have been solved by now. To start with, virtual machines now have much better separation of resources and usually have a management VM too. So an app eating up all memory or bandwidth should be less of a problem now. (Not non-existent, though; bugs will always be there.) The email chain is something that needs to be set up. They mentioned a webpage for each node, but I have my doubts about that being used. Any comments? Is it being used?

In a few cases I also felt that the purpose of a planetary testbed is being compromised by imposing too many dos and don'ts. If I want to test whether I can transfer files over the Internet at gigabit rates, what else can I use if PlanetLab restricts my bandwidth (EmuLab, Illiac)?

MON: On-demand overlays for distributed system management, J. Liang et al, WORLDS 2005.

MON proposes a lightweight overlay network for management of nodes on PlanetLab. The overlay is constructed on demand, either as a tree or as a DAG. A tree is suitable for status queries and a DAG for software updates. A status query can ask for resource utilization, like top, for all nodes, and also has support for in-network aggregation. Software update uses MON to distribute updates in a distributed fashion, with less dependence on a central repository. These are two problems that came out of the tech report, and MON provides very good solutions to them.

MON's motivation is very interesting: instant management commands. Pull up a terminal and type commands to PlanetLab nodes as if they were on your network and you were the sysadmin. But to me it was not clear whether that was achieved. I would say the intro is misleading.
--
Fariba Khan
217-778-3922
PhD candidate
Illinois Security Lab
University of Illinois

From: Riccardo Crepaldi [crepric@gmail.com]
Sent: Thursday, March 13, 2008 2:12 AM
To: Gupta, Indranil
Subject: 525 review 03/13

MON: On-demand Overlays for Distributed System Management

This paper presents MON, a distributed management overlay network system. The system provides an easy and efficient way to execute queries and commands in parallel on several clients in a distributed system. The architecture is maintained by a server running on every machine. The hierarchical structure among nodes is created on demand, when a command or a query needs to be executed. The server has a three-layer design that manages membership, overlay creation and communication. In the paper the authors show how two different structures, a tree and a DAG, can be created among nodes on the fly to satisfy the different requirements of a query or a software push. The system is said to be running on more than 300 machines on PlanetLab, and to be able to return an answer to a query in less than 2 seconds. The definition of the query syntax is interesting, and it can be easily extended. The software push component is said to be not yet mature.

The approach of building an on-demand structure every time a query needs to be executed is lightweight and deals very well with the dynamic network configuration caused by the variable availability of nodes. The final goal is achieved: users can run queries and commands on all their PlanetLab machines in a distributed way.

However, the software push experiment shows that this part of the project was not yet completely optimized (the 20 nodes in the experiment can probably be handled with a central approach). Additionally, it is not clear how much effort would be needed to extend the query capabilities with new (more complex) syntax and maybe data structures.
Finally, in the whole paper there’s no mention of what happens if more than one command is executed at the same time, or if commands are issued frequently. It seems that the overlay will be recreated every time, with a high overhead that could perhaps be avoided if the membership were kept for a certain time. There is no mention of the overhead in the paper. Even if it is true that the gossip protocol keeps it low, it might still be relevant if the query frequency is high enough.

AVMON: Optimal and Scalable Discovery of Consistent Availability Monitoring Overlays for Distributed Systems

This second paper presents AVMON, another monitoring overlay, which is scalable and handles the availability monitoring problem well in large, distributed networks. The paper places great emphasis on the fact that the protocol is intended to be robust against malicious nodes. The membership approach not only prevents a node from being selfish (i.e. advertising itself with better performance) but also detects colluding nodes that collaborate in order to maliciously increase their availability level.

The protocol is actually very simple and the paper describes it very quickly. The membership, computed using a hash function, is consistent, verifiable and random. Each node has a set of nodes that monitor it, and a set of nodes that it monitors. Because the hash function is known by the whole network, any third node can verify this relationship. The paper defines six goals that were taken into account when designing this algorithm: Consistency, Verifiability, Randomness, Discoverability, Load Balancing and Scalability. The authors claim several times that this is the first algorithm that addresses all these goals at the same time. A large portion of the paper addresses the definition and proof of the most important properties of the algorithm, and at the end the evaluation section provides simulation results.
I think the main contribution of this paper is that it provides a really simple solution for an important task, monitoring the availability of nodes in a distributed system, while dealing with the presence of malicious nodes. This approach can become very useful as a service for many applications that need availability information to behave in an optimized way, and it guarantees reliable results.

However, the paper does not really consider how this approach compares with similar protocols, like those presented in the related work section.

From: Qiyan Wang [qwang26@uiuc.edu]
Sent: Thursday, March 13, 2008 1:13 AM
To: Gupta, Indranil
Subject: CS525 review 3/13

MON: On-demand Overlays for Distributed System Management

MON builds on-demand overlay structures that allow users to execute instant management commands, including querying the current status of the application and pushing software updates to all the nodes. The advantage of MON is that it is lightweight, requiring a minimal amount of resources when no commands are executed. Maintaining an overlay structure for a long time is difficult, due to the various kinds of failures that can occur in a dynamic system. The on-demand approach also frees MON from complex failure repair mechanisms, since no overlay structure is maintained for a prolonged time. In each management session, a fresh overlay structure is generated dynamically. The experiments with MON running on PlanetLab show that MON performs well in terms of command response time and achieved bandwidth of software push.

Some criticisms: In order to have a quick and efficient overlay construction mechanism, MON trades off node coverage by using a random approach to construct the overlay tree or DAG. As a result, the established tree may not cover all nodes in the system. However, some commands that users may need depend on information about the whole system, such as finding the server with the maximum available resources.
In that case, the returned output may be incorrect with some probability.

AVMON: Optimal and Scalable Discovery of Consistent Availability Monitoring Overlays for Distributed Systems

The authors address the problem of selection and discovery of a consistent availability monitoring overlay for computer hosts in a large-scale distributed application, where hosts may be selfish or colluding. They build the AVMON system to satisfy the six properties of consistency, verifiability, randomness, discoverability, load balancing and scalability. They make an algorithmic contribution: a protocol for discovering the availability monitoring overlay in a scalable and efficient manner, given any arbitrary monitor selection scheme that is consistent and verifiable. The AVMON system leverages consistent hash-based pinging set selection, and also includes practical optimizations of the algorithms to address high-churn systems. The evaluation experiments show that AVMON performed well under three synthetic churn models (static, join-leave, and join-leave-birth-death) and under availability traces from PlanetLab and Overnet.

From: dkassa2@uiuc.edu
Sent: Thursday, March 13, 2008 1:01 AM
To: indy@cs.uiuc.edu
Subject: 525 review 03/12

==============================================================================================
Review 11:
Paper Title: AVMON: Optimal and Scalable Discovery of Consistent Availability Monitoring Overlays for Distributed Systems

The paper first specifies six design goals to address the challenges of availability monitoring services for computer hosts in large-scale distributed systems. The paper argues that existing availability monitoring schemes in the literature fail to address these necessary design goals.
The paper then presents AVMON, the first complete system for selection and discovery of an availability monitoring overlay which satisfies the six design goals of consistency, verifiability, randomness, discoverability, load balancing and scalability. AVMON comprises two algorithms: (a) a distributed, efficient, scalable, and load-balanced algorithm for discovery of monitors according to any consistent and verifiable selection scheme, and (b) an optimal variant of this discovery protocol, derived in order to optimize memory and communication bandwidth, discovery time and computational complexity. AVMON is presented in detail and numerical results are given to verify the results. I am not convinced that the paper makes any significant or especially clever contribution to distributed systems. Apart from this I have no major negative comments against the paper.
===================================================================================================
Review 12:
Paper Title: MON: On-demand Overlays for Distributed System Management

The paper presents the design and preliminary evaluation of MON, a management overlay network designed for large distributed applications. Unlike existing tools, MON focuses on the ability of a user to execute instant management commands. MON does not rely on other infrastructures such as DHTs. MON builds on-demand overlay structures for status query and software push. Hence MON is lightweight, simple and failure resilient. The fact that MON is on-demand frees it from complex failure recovery mechanisms, as no overlay structure is maintained for a long time. The management overlay network (MON) system, which is tested on PlanetLab, facilitates the management of large distributed applications by allowing users to execute instant management commands pertaining to their applications. For scalability, MON adopts a distributed management approach.
An overlay structure (e.g., a spanning tree) is used for propagating the commands to all the nodes, and for aggregating the results back. Numerical results are given to verify the performance of MON. Some weaknesses and possible extensions of MON are also discussed in the paper. I have no major negative comment against this paper.
===========================================================================================================

From: Zixia Huang [zixia.huang@gmail.com]
Sent: Wednesday, March 12, 2008 11:23 PM
To: Gupta, Indranil
Cc: Zixia Huang
Subject: 525 review 03/13

Paper Title: Distributed System Management: PlanetLab incidents and management tools
Author: Robert Adams
Summary: I have been using PlanetLab for nearly a year, and I really appreciate the research community that provides such a good environment for my overlay network experiments. This document deepens my understanding of how PlanetLab works and what problems its operators previously encountered. It can mainly be divided into two parts: incidents and tool development. Typical incidents include broken hardware, broken software, network issues (excessive bandwidth and inappropriate traffic), and application services. Tool development covers bringing down nodes back into service, traffic tracing, disintermediation, and resource control. This paper is very useful for me to understand how PlanetLab functions.

Paper Title: MON: On-demand overlays for distributed system management
Author: Jin Liang, Steven Ko, Indranil Gupta and Klara Nahrstedt
Summary: The main contribution of this paper is a Management Overlay Network (MON) system designed to facilitate the management of large distributed networks. Because of its on-demand nature, MON is very lightweight and needs only minimal resources. The architecture of MON can be divided into three layers: distributed system management, overlay construction and membership management.

From: Rahul Malik [rmalik4@uiuc.edu]
Sent: Wednesday, March 12, 2008 10:59 PM
To: Gupta, Indranil
Subject: 525 review 03/13

MON: On-demand Overlays for Distributed System Management
SUMMARY: In this paper, the authors present a management overlay network to facilitate the management of large-scale distributed applications. The main advantage of their system is that its approach is on-demand. As a result, they do not need to maintain a long-lived overlay. Their system provides functionality for pushing application code to the nodes, starting the applications, and querying the status of the applications. It also follows a distributed approach as opposed to a centralized one. The system consists of a daemon process running on each node of the distributed system. It consists of three layers. The top layer is responsible for distributed membership management. Each node maintains a partial view of the system and nodes constantly exchange information through ping-pong messages. They associate each entry with an age. The middle layer is responsible for on-demand overlay construction. They build two kinds of overlays: trees and DAGs. To take locality into consideration, they introduce a new algorithm called two stage. Finally, the lowest layer is responsible for instant command execution. They have various kinds of commands. They have tested their system on PlanetLab nodes.
PROS: There are several good features of the algorithm. It is a distributed architecture. As a result, there is no load on one central node. Also, the construction of the overlay is on-demand. As a result, there is no heavy cost of maintaining the overlay.
CONS: There are several weaknesses of the approach that have not been addressed in the paper. Suppose a node sends a query message that is passed through the tree. If the node fails after passing the query on, then all the data in that subtree is lost. They have not addressed that problem.
Also, for their two-stage approach to tree construction, they do not say when a node should start choosing from its local list. An analysis of this is missing. Also, the approach for making the DAG is very ad hoc. They should do more analysis of how many parents a node will have on average, and so on.
FUTURE WORK: They should do more analysis of the system under node failures. There is no data for that analysis. Also, they should test the software push on a larger number of nodes.

Distributed System Management: PlanetLab Incidents and Management Tools
SUMMARY: In this draft, the author mainly describes the various incidents that took place in the early stages of PlanetLab and the management tools that have been built since then. First of all, they describe the overall operation of PlanetLab. It consists of a collection of computers, called nodes, distributed around the Internet. Each application sees the node as a private Linux computer and can perform various kernel-level operations on it. A slice is distributed over the nodes in the form of slivers, each of which runs on an individual node. Each site has a principal investigator who is responsible for that site. They mention various incidents that have occurred and the lessons that were learned from them. The incidents ranged from broken hardware and software, to network problems such as excessive bandwidth and inappropriate traffic, to problems in applications and services. To deal with some of these problems, they have developed tools that are described in the paper. They are mainly divided into four categories: bringing down nodes back into service, tracing traffic, disintermediation and controlling resources.
PROS: In this draft, they point out various problems that have occurred and the lessons learnt from them. They have developed tools to deal with some of these problems.
This gives a good research direction and many open research problems in distributed systems, distributed resource management and so on.
CONS: One of the major drawbacks is that the paper is very old, written in 2003. PlanetLab is a very rapidly evolving platform and a lot has changed since then. So, not much can be said about its current status from this paper. Also, they have mainly discussed problems from the point of view of an administrator, not from the point of view of a PlanetLab user who has to run various applications on its distributed nodes. For example, how can a user deploy applications, aggregate data, and so on? Also, they should describe the system architecture in more detail.

From: Hengzhi Zhong [hzhong@uiuc.edu]
Sent: Wednesday, March 12, 2008 4:30 PM
To: Gupta, Indranil
Subject: 525 review 03/13

Distributed System Management: PlanetLab Incidents and Management Tools
Summary: This paper talks about usage experiences on PlanetLab, an open, distributed test bed for various distributed applications and services. In October 2003, there were 200 nodes at 90 different sites. A small central support team maintains the nodes. There are also local support teams at each site. This paper discusses various incidents that happened while using PlanetLab and the lessons learned. The incident categories can be broken down into hardware and driver problems, software problems, networking problems, and user-application problems. These incidents suggest more thorough testing of hardware and software, cautious use of software, limited allocation of resources (such as bandwidth, disk space, etc.), and better reactive measures when user applications create problems. Various tools are discussed to restore nodes, monitor traffic, and detect and repair problems. Traffic monitoring is really important. A tool was developed for tracking which IP addresses the processes were sending to and receiving from.
Then a tool analyzes this information and calculates the connections from a source to a destination by some slice. This information is then sent to the central support group. This seems more appropriate than analyzing packets. Different sites have different restrictions and policies. PlanetLab should group sites with similar restrictions together, helping users to distribute their applications on appropriate sites. It is possible to create an application to snoop around on PlanetLab. The application can snoop on other applications to see what they are doing, what data they are using, etc. Are there any security measures against that?

MON: On-demand Overlays for Distributed System Management
Summary: MON is a distributed management overlay network for PlanetLab. It builds an on-demand overlay for each user management command. This approach is lightweight. The MON system has a MON server running on each node. Each MON server has 3 layers, responsible for membership management, creating overlays on-demand using the membership information, and propagating commands and aggregating results on the overlay. The membership management is gossip-style. MON uses a tree structure for status queries and a DAG structure for software push. There are two tree construction algorithms. In random tree construction, a client program sends a message to a nearby MON server. The server then forwards the message to k nodes. The other construction is two stage, which is locality aware. The first stage of tree construction is the random tree construction. In the second stage, each node selects its local neighbors first. The trees can be converted to DAGs.
Pros:
1. on-demand approach for management commands makes it lightweight and simple.
2. allows users to manage their own applications on PlanetLab
Cons:
1. there can be many overlays at once if many users all execute commands at the same time.
2. different queries use different overlay structures.
So, for a new type of query, does that mean a new structure must be proposed?
3. for the software push, the experiment uses up to 20 nodes. However, what about a bigger number of nodes? What’s the scale?
4. gossip-style membership management may incur message overhead.

From: emenese2@uiuc.edu
Sent: Wednesday, March 12, 2008 10:13 AM
To: Gupta, Indranil
Subject: 525 review 03/13

Paper: Mon: On-demand Overlays for Distributed System Management
Reviewer: Esteban Meneses
This paper presents a technique for dealing with the problem of distributed system management by means of “on the fly” overlay creation and a set of query commands. The basic idea is that maintaining overlay information is costly and error prone, while creating an overlay on demand will generate overlays that reflect the current conditions of the network and that don't require much effort to maintain for the short time they will be used. The MON architecture consists of three layers: distributed system management, overlay construction and membership management. MON uses a gossip-style protocol for membership management. Each node maintains a partial list of the nodes in the system. A ping-pong mechanism helps nodes keep track of new nodes as well as failed ones. Once an overlay is requested, either a tree or a DAG is built. The tree is intended for an instant status query, while the DAG is more apt for a software push. The reason for using a DAG as the primary structure for a software push is that a tree prevents a node from downloading the software faster than its parent's rate, and a tree is also less failure resilient. The overlay is created by a “session” message sent by one node to a MON server. This server will contact k members of its partial list with the session message to make them part of the overlay. However, to include locality in the formation of the overlay, a node also holds a local list with nodes in its vicinity.
In the formation of the overlay, such nodes will also be considered for inclusion. The main advantage of MON is that no overlay information is kept and maintained in the nodes while such a structure is not needed. However, the disadvantage is that the overlay construction must be performed every time it is demanded, thus incurring an overhead. The idea of on-demand overlays is interesting, since for certain applications much effort is invested in maintaining an overlay that might be used infrequently. Nevertheless, it is necessary to keep the membership information up to date; otherwise the overlay formation will end up with a very flaky structure. On the other hand, I wonder if this scheme can be extended to take “network behavior” into consideration. I mean, according to the state of the network, we can keep certain information or let it be rebuilt every time. For highly dynamic networks with a high churn rate, we should keep as little information as possible, given that this information will be mostly out of date. However, for more stable networks, perhaps keeping information (rather than generating it every time) will be more valuable. I didn't see any experiment in the paper measuring the overhead of building an overlay every time. Perhaps the queries are frequent and the overlay is being built again and again when it would be cheaper to maintain an overlay structure. Moreover, there was no comparison between MON and other tools designed for the same purposes.

Paper: Distributed System Management: PlanetLab Incidents and Management Tools
Reviewer: Esteban Meneses
This paper presents a huge list of technical problems that the central team of PlanetLab had during the first year of operation. Although the layout of the paper is very different from the common systems research paper, there are several interesting observations, plus many obvious ones.
PlanetLab is an open, global and distributed test bed for researchers around the world to develop, deploy, test and experiment with their distributed systems initiatives. The infrastructure is made up of many institutions around the world that decided to donate several servers in order to be part of PlanetLab. Each individual computer is called a node, and it runs a special Linux version that provides a virtual server for each running application. The idea behind separating applications into virtual servers is that isolation offers a good chance to study the behavior of a particular application. Thus, every program believes that the computer is dedicated only to itself. A running application is called a slice, which runs across many distributed nodes. A sliver is the part of a slice that runs on a particular node. The main characteristics of PlanetLab are that the applications are not controlled by a certified authority and that the research work will affect the local network sites as well as the central PlanetLab committee that is in charge of keeping PlanetLab running. The rest of the paper describes incidents involving broken hardware, malfunctioning software, network problems and application misbehavior. PlanetLab has many advantages as a testbed, because it gives a real sense (more or less) of how the application can run in the real Internet. However, many institutions in PlanetLab are connected by the specialized Internet2 backbone, which undermines that conclusion. Nevertheless, PlanetLab is still a good thermometer for several networking issues. Furthermore, PlanetLab incorporates issues about overlays and it is itself a more failure-resilient backbone. Extending the RON idea of having several specialized sites to route across the Internet, PlanetLab can execute programs in the same fashion on top of the overlay.
Perhaps one disadvantage of PlanetLab is that control is centralized in one authority. I wonder whether the management tools developed by the PlanetLab team can be used in a more dynamic environment. From the plots in the paper, it is quite clear that the overlay doesn't have a lot of churn. Will these tools work in a more complex setup? Is it possible to decentralize the control in such a way that it can scale better? Also, it would be interesting to have a more flexible PlanetLab where a few nodes were maintained by the central group and others by the local groups.

From: Anthony Cozzie [acozzie@gmail.com]
Sent: Monday, March 10, 2008 8:23 PM
To: Gupta, Indranil
Subject: 525 Review 3/11

I'm going with the two MON papers in the hope of another easy set of reviews ;)

MON
Managing large numbers of computers can be very cumbersome; simple tasks like SSH'ing in and starting a process suddenly become very tedious when 500 machines are involved. For example, Google has gone to a great deal of effort to make their large clusters of machines easily usable, with tools like Borg, GFS, MapReduce, BigTable, and so on. However, all of these solutions use a central server and a (relatively) high performance environment (a server farm). MON attempts to do the same sort of thing in a distributed, reliable manner over the internet. Since MON is able to achieve better load balancing than a central server, I think it is basically superior (although obviously it does much less than the Google tools mentioned above). OTOH, none of the results are particularly surprising or novel.

AVMON
After perusing this paper it appears I was duped, and while both MON's in the name refer to monitoring, the papers are quite different. AVMON attempts to build a theoretically optimal availability monitor (low traffic and collusion resistant).
The protocol is fairly simple: when a node joins it is assigned to CVS (presumably sqrt(N)) nodes, and then each round each node shuffles its CVS to keep it random. Each node in the CVS is then monitored while it monitors a set of nodes based on a hash function. The paper seems a little weird: it's almost like one monitoring scheme on top of another. I do not like the collusion resilience paragraph (4.3), specifically the derivation (1- (CK)/N) -> 1 as N -> infinity, or the similar derivation in the next paragraph. The bound of O(N/log(n)) is actually relatively low (say 5% for a system of 1 million nodes) and while admittedly (1-K/log(n)) does go to 1 as N goes to infinity, the logarithmic function grows so slowly that in practice the probability that a colluder appears is quite reasonable. For example, in a system of 1 billion nodes, there is an O(3%) chance that a node will be monitored by a malicious node. The second paragraph is even worse, because the "D total colluding relationships" implies only sqrt(D) malicious nodes if all the malicious nodes are willing to work together. For example, if our system has 1 million nodes, sqrt(D) is order 1000/20 = 50 -> a very small fraction. Of course, maybe I just don't understand the mathematics.
anthony
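
The percentages quoted in the AVMON review above (about 5% for 1 million nodes, about 3% for 1 billion) can be reproduced with a short calculation. This is only a sketch of the reviewer's arithmetic, not of the paper's analysis: it assumes the fraction in question is 1/log(N) and that the logarithm is base 2, neither of which the review states explicitly. The function name `colluder_fraction` is hypothetical.

```python
import math


def colluder_fraction(n: int) -> float:
    """Return 1 / log2(n) for a system of n nodes.

    Assumption: the review's "O(N/log(n))" colluders out of N nodes
    corresponds to a fraction of 1/log2(N); the base-2 logarithm is a guess.
    """
    return 1.0 / math.log2(n)


# The two system sizes mentioned in the review.
for n in (10**6, 10**9):
    print(f"N = {n:>13,}: 1/log2(N) = {colluder_fraction(n):.1%}")
```

Under these assumptions the fraction drops only from roughly 5% to roughly 3% while the system grows a thousandfold, which is the slow logarithmic decay the review objects to.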