From: Mike Earnhart [mearnhart@gmail.com] Sent: Thursday, February 23, 2006 8:30 AM To: Indranil Gupta Subject: 598ig review 02/23 Michael Earnhart CS598 IG February 23, 2006 Summary The End to End Argument and Preserving Peer Replicas By Rate-Limited Sampled Voting In The End to End argument the primary topic is the design of software programs and more specifically those that require a communication subsystem. They claim choosing the boundaries between functions is the primary activity of a system designer. And if it is the primary activity then this topic should be discussed and analyzed which is what they do. The first conclusion they reach that is significant is that if a data communication system to go out of its way to provide reliable data transfer it does not relieve the entire burden from the application is reliability is an absolute requirement. They are emphasizing that the only places guarantees can be made are at the application end points. Therefore they claim lower levels need not provide "perfect" reliability. This is a large step away from what the current transport protocol TCP attempts. However due to the nature of software it is impossible to provide a general rule that would be applicable to all software. They conclude with several examples where end-to-end design would aid significantly to the reliability of a system where the low level mechanisms where insufficient. The importance of this paper is admitting that low level reliability will inherently be incomplete due to the lack of knowledge and/or capabilities at those levels. Also it is importance because it clearly indicates that low level reliability must be in place for reasonable correction of frequent errors. They give the example of file transfers and if only end to end design were utilized every bit error would require a complete retransmission of the file or a portion of it where as if it were detected at a lower layer a single packed could be retransmitted. Therefore they are wise enough to note that end to end has significant limitation but still maintain that it is the only way to assure "perfect" reliability. The problem that seems to stick out is the ambiguity of when end to end argument should be applied versus to what degree of low level reliability should be implemented. This paper essentially states trust nothing lower than the application layer. This provides a nearly impossible programming environment. In Preserving Peer Replicas By Rate-Limited Sampled Voting the problem of long term reliable data storage is approached in an extremely conservative manner. They have significantly difficult design principles such as: the use of non-reliable storage device, no long term secrets such as encryption keys, no third party trusts, and arbitrarily strong adversaries. They essentially utilized the concept of inertia to maintain the stability of the storage. They then describe in significant detail their opinion poll protocol and the mechanisms used to maintain it's stability and reliability. Finally they identify all the potential capabilities of an adversary and the types of attacks they might enact. Following that discussion of adversaries they run simulations on Nares to show their system works. Finally their results are discussed which are generally amazing and future work is presented. The overwhelming strength of this paper is that this distributed system is nearly autonomous and makes minimal assumptions about future computing such that it requires no intervention of any 3rd party trust, and relies on no encryption key or secret keeping. This seems to be a fairly strong requirement and provides a new level of reliability. However the use of rate limiting is also very interesting and clearly aids in maintaining stability in such a large and long running system. Again the magnitude of this system such that many metrics are measured in years and compromised nodes in the 10's of percentages is very impressive but the costs and performance is somewhat glossed over. The issues are simply that costs and performance is glossed over. Towards the end the paper states, "We have shown that a combination of massive replication, rate limitation, inherent intrusion detection and costly operation can produce a peer-to-peer system with remarkable ability to resist attacks...". This obvious and so likewise many things in this world provided unlimited resource they can do significant deeds. They seem to be doing a significant amount of brute force to avoid or alleviate the nearly impossible problem of unbounded adversary capabilities. I am not satisfied with out some explanation of require resources and or performance marks. From: Maifi Khan [maifi.khan@gmail.com] Sent: Thursday, February 23, 2006 8:10 AM To: Indranil Gupta Subject: 598ig review 02/23 Title: Preserving peer replicas by rate-limited sampled voting. Summary: This paper proposes a new design for managing replica in a peer-to-peer environment that is scalable and will provide security and reliability over decades. There previous work, LOCKSS (Lots of copies keep stuff safe) was designed to facilitate sharing journal documents among libraries using low cost infrastructure. But it suffers from scalability and security problems. In there new design, a LOCKSS peer call opinion poll on the contents of an AU (archival unit) to detect damage of the contents. After receiving votes from a subset of peers it will detect if the votes agree with its own copy. If agree (landslide win), it decides the copy is fine otherwise in case of disagreement (landslide loss) it gets a copy from a peer who disagreed. After repairing it polls again. If it cannot decide whether win or loss, it raises alarm and manual repair is needed. Poll participants are divided in inner circle class whose vote decides the decision and outer circle class who is used for peer discovery. Discussion: They use rate limiting technique to limit the progress of an attacker. So the rate at which a loyal peer polls is limited. It also implies detection of a damage by a loyal peer will also be delayed. They have not implemented the new design in real life scenario which may affect on various aspects of the design. Lack of stable identities make it vulnerable to session hijacking attack. The protocols performance is not good against attrition adversary. Title: End-to-End arguments in system design. Summary: This paper talks about a design principle about placing functions while designing a system. Their argument is that placing a function always a low level is sometimes inefficient and can be redundant also. They explains this design principle giving some examples as providing encryption at low level, duplicate message suppression etc. There are some functions that can be completely and correctly implemented with the knowledge of the application at the endpoints. Fro example, if we use packet retransmission at router, it may be worse for real time application such as real time streaming video where delayed packets are worse than lost packet. Awareness about end to end argument can prevent designers designing layered systems and reduce the temptation to implement functionality at lower layers that may be redundant or wasteful. Discussion: Many times when designing protocol, it is hard to anticipate what kind of applications are going to use it. So a design that may adhere to end to end argument today may not tomorrow. This why it is really hard to claim that a design is going to be efficient forever. With the changing system specification the design principles also changes. From: Muyuan Wang [mwang2@uiuc.edu] Sent: Thursday, February 23, 2006 8:04 AM To: Indranil Gupta Subject: 598ig review 02/23 A Comparison of Overlay Routing and Multihoming Route Control Summary: In this paper, the authors compare the performances intelligent route control of BGP and ISP multihoming, and analyze their advantages and disadvantages. They compare the relative benefits of overlay routing and multihoming route control in terms of round-trip latency, TCP connection throughput, and path availability. They also point out that, route availability and route selection are two key factors which have not been carefully evaluated in past work. They try to explore that, when using multihoming and route control, how much benefit will overlay routing provide over BGP. The multihoming scenario is simulated by selecting a few different ISPs in a metropolitan area, and use them collectively as a stand-in for a multihomed network. Totally 68 servers belonging to 17 cities' network are selected. They use several different datasets, including active HTTP downloads of small objects, the active downloads of 1 MB objects between the same set of node-pairs. Then, the performances of multihoming route control and overlay routing are compared in terms of three key metrics: round trip delay, throughput and availability. Through the experiments, the paper demonstrates that, adding capability of two important factors at end-networks will benefit BGP routing, shown as follows: 1) additional access to end-to-end BGP routes via ISP multihoming, and 2) implementation of performance and resilience-aware route control mechanisms to dynamically select among multiple BGP routes. Moreover, by comparing the relative benefits of overlay routing and intelligent route control, They find 3 important things: 1. Multihoming route control can only offer performance similar to overlay routing, not better. 2. The RTT performance of overlays is slightly better because of the ability to select shorter end-to-end routes,. 3. Route control can eliminate most observed failures on end-to-end paths, but it cannot offer the near perfect resilience of overlays . Comments: The comparison experiments are carefully designed, and many of their results are convincible. I just doubt that, to what degree the choice of ISPs will affect the results of comparison. Different ISPs may not only have different size and RTT, but also the different combination of them can provide different experiment results. For example, I am curious of which performance of multihoming will be better, comparing ISPs with sizes differ greatly or the same? The different distribution of their nodes may also affect the performance of multihoming. -------------------------------------------------------------------------------------------------------------------------------------------------------------- End-to-End Arguments in System Design Summary: This paper raises an argument of the function placement among the modules of a distributed computer system, i.e. choosing the boundaries between functions. It mainly focuses on the problems in communication network, and tries to give a design principle through investigation. As the emergence of data communication network at 80's, the argument of function placement has been already used for some years till that time. However, it has never been explicitly recognized. This paper articulates the argument explicitly, to examine its nature and to see how general it really is. Then, the paper investigate the end-to-end argument in detail. The following aspects are examined: end-to-end caretaking, performance aspects, delivery guarantees, secure transmission of data, duplicate message suppression, guaranteeing FIFO message delivery, and transaction management. After these investigation and discussion, the author concludes that, when choosing the functions to be provided in a communication subsystem, end-to-end arguments are a kind of Occam's razor, . Because the communication subsystem is frequently specified before applications that use the subsystem are known, the designer may be tempted to 'help' the users by taking on more function than necessary. Awareness of end-to-end arguments can help to reduce such temptations. Such layerings are desirable to enhance modularity. End-to-end arguments may be viewed as part of a set of rational principles for organizing such layered systems. Comments: This paper is published many years ago, and so we can examine its value today by checking the correctness of its findings. We can easily see that most of its statements are still correct and meaningful today. Clearly it is a classical paper which should be listed in the textbook. From: Bach Duy Bui Sent: Thursday, February 23, 2006 7:56 AM To: Indranil Gupta Subject: 598ig review 02/23 End-to-End Arguments in System Design 1. Summary The paper advocates the important of end-to-end function placement in system design. An example of a reliable file transfer system is first considered. Authors shows that for the system to sustain against multiple threats, it is necessary to have a file-transfer specific, end-to-end reliability guarantee application program for the reliability of data communication system itself is not enough. While high-level function placement is necessary without any doubt, the question of in which degree a function should be put in high-level versus low-level of a system is far more complicated. The answer to this question will affect much the system performance. Other examples such as delivery guarantees, secure transmission of data, duplicate message suppression, guaranteeing FIFO message delivery, transaction management also demonstrate the need of end-to-end function placement. The authors also mention that the understanding of the nature of end applications is needed for an efficient design. 2. Comments The idea of end-to-end function placement is very important in distributed system. Many distributed applications have proved the correctness of this argument. A prominent example is the Internet. The success of the internet and applications on top of it owns a part to its reliance on end-to-end applications. But end-to-end argument is not always a perfect idea. In fact, the more the functions are put higher, the less the administrators can monitor the system. That results the fact that applications have to protect themselves from adversary actions. However in many situations, these protections need a system-wide monitor and control. Examples are many internet attacks. Given now a day development of wide-are networked distributed system, it becomes more and more important for designers of such systems to have a rigorous framework that helps them to reason about function placements. Preserving Peer Replicas By Rate-Limited Sampled Voting 1. Summary In this paper, the authors present a design for and simulation of a new peer-to-peer opinion poll protocol that address these scaling and attack resistance issue. The design is evaluated in the same way that encryption systems are evaluated showing the trade off between the cost to deliver a corrupt copy of a document and the value of the document at risk. Some features of the system are cheap, need not operate quickly, and function properly for decades without central control and despite possible interference from attackers or catastrophic failures of storage media. The system is designed based on an observation that evidence that many peers independently obtained and agree with each other on the material is the best available guarantee that content is authentic and correctly preserved. A main part of the paper is the proposal of an option mechanism as a medium for peer nodes to cooperate to detect and repair damage of archival documents. Followings are basic techniques used in the system. Reference List Churning: a peer uses reference list to avoid depending on a fix set of peers for maintenance of its data. Effort Sizing and Rate Limit: a peer wants to make a change to the system consistency need to take effort, a more change requires a more effort. Intrusion detection: the protocol raises the alarm when a peer (1) polls inconclusively (2) suspects local spoofing (3) has been unable to finish the a poll for a long time. 2. Comments By using multiple techniques list churning, rate-limiting and checking consistence between repair and vote, the system is effectively protected against adversary actions. From: Long Hai Vu Sent: Thursday, February 23, 2006 7:36 AM To: Indranil Gupta Cc: Long Vu Subject: CS598ig review 02/23 CS598IG-Spring 06 Name: Long Vu Date: 02/23/06 REVIEW PAPERS End-to-end arguments in system design In this paper, a design principle is proposed to help designer place suitable functions among modules of a distributed system by giving critical examples. In essence, end-to-end argument suggests when designer should place functions in higher level of the system. This argument can also be viewed as part of a set of rational principles for layering systems. Furthermore, according to the authors, placing functions in higher layer could improve system performance and avoid redundancy. Discussion: 1. What are the drawbacks of end-to-end argument? End-to-end argument is used popularly and becomes a principle in system design. However there might be several drawbacks: - Delay: Placing functions in higher layer results in higher delay in processing message sent among nodes. This is obvious as message need to travel through higher layer immediate nodes and gets higher delay time - Flexibility: In many applications, preprocessing in lower layer could improve flexibility of system. The high layer should not take care of lower layer changes. Lower layers are close to physical aspect of system; thus, it is more heterogeneous than higher layers. Placing functions in higher layers limits the heterogeneity of underlying layers and devices and therefore makes system inflexible Preserving Peer Replicas by Rate-Limited Sampled Voting In this paper, authors introduce a design principle and deploy a test system, then they describe a new peer-to-peer opinion poll protocol in LOCKSS system. In addition, they analyze several attacks to simulate system behavior. As stated in the paper, this proposed approach is scalable and addresses resistance issues. There are some principles applied in this paper such as: cheap storage is unreliable, there is no long-term secrets, use inertia, avoid third party reputation, reduce predictability, intrusion detection is intrinsic, assume a strong adversary. There are two immediate goals in this paper: implementation of test sites and protocol’s improvement against the attrition adversary. In the future, they need to enhance the proposed model to account for peers that are either malign or loyal. Also, they need to prove the optimality of the simulated adversary strategies proposed in the paper. They get the system’s cost structure by setting parameters but they have shown how to choose these parameters’ values. The set values, therefore, are heuristic basis. From: Ravishankar Sathyam Sent: Thursday, February 23, 2006 5:06 AM To: Indranil Gupta Subject: 598ig review 2/23 Ravi Sathyam End to End Arguments in System Design Summary: This paper proposes a new design principle that helps place functions amongst the various modules of a distributed system. It proposes that function placement should be at a higher level than before, closer to the application that uses the function. As an example, in communication systems, functions guiding the communication subsystem can be correctly implemented only with the help of applications on either end of the system. So one would tend to place such functions closer to the ends (closer to the application), rather than with the system. Such an argument against low-level implementation is known as “end-to-end” implementation. One example of end-to-end implementation versus lower-level implementation of functions is in the area of file transfer. For example, if an objective is to move the file from computer A’s storage through the data transmission layer to computer B’s storage, and one threat that can occur is that the file may contain incorrect data at certain points in time due to hardware failures. Now there are two ways to counter this threat. One way is to have checksums and an application-level function that checks this checksum on computer B, and transmits this checksum back to A to compare with the original checksum to make sure the data isn’t corrupted. This is the end-to-end approach. A second approach is to provide an internal guarantee of reliable data transmission by the communication system. However, this does not still account for Threat 1 that was proposed in the paper. Hence, in other words, providing reliable data transmission on the system level does not reduce the burden to check for corrupt data at the application level. So essentially guaranteeing reliable data transmission at the system level is a tradeoff between performance and network reliability. The paper also provides other examples of end-to-end function placement. These include providing delivery guarantees, secure data transmission (can the system really be trusted to perform proper encryption, and can the network really be trusted to route the un -secure data coming out of a node?), duplicate message suppression, FIFO message delivery guarantees, and transaction management. The paper also goes on to provide examples of the end-to-end argument in other system areas, such as two-phase commit data update protocols, error control etc. Preserving peer replicas by rate-limited sampled voting Summary: This paper presents a new peer to peer protocol (LOCKSS) that incorporates rate limitation and intrusion detection by using opinion polls. This protocol works as follows amongst p2p networks where each node is a web cache containing archival units of a particular journal (AU): Assuming that each peer’s AU is subject to low rates of random undetected damage, a LOCKSS peer calls opinion polls on the contents of an AU it holds at a much higher rate than any anticipated rate of damage. A small subset of peers that have been recently encountered are invited to this poll, and they offer votes on their own version of the AU. A poll caller can get a landslide win (the incoming votes overwhelmingly agree with its own version of the AU) or a landslide loss (where incoming votes overwhelmingly disagree with its AU). In the event of a landslide loss, the poller repairs the AU by fetching the copy of a peer who disagreed and then conducts a poll again. If a landslide loss occurs again, then alarm is raised. In order for this to work, each peer maintains two peer lists for every AU it holds: a reference list (containing information about recently encountered LOCKSS peers), and a friends list (containing information about LOCKSS peers with whose operators the peer has an out-of-band relationship). Also, for every AU, the peer contains a poll counter that denotes the number of polls held for it. When a peer first enters a LOCKSS network for a particular AU, it will copy all entries from its current friends list into its reference list. Now, to call a new poll on this AU, the peer chooses N random peers from its reference list (and places them in an inner-circle) and a random poll ID, and then sends a Poll message to all the peers, and waits for the PollChallenge reply from them. Peers not giving the proper PollChallenge reply, or conflicting PollChallenge replies are “discredited” and removed from the inner-circle list. Now, for every positive PollChallenge reply, the peer sends an encrypted PollProof message, and waits for the Nominate messages from the respective peers (as usual, until the timer expires and provided that there are a sufficient amount of peers in the inner-circle). The voter will receive the PollProof and verifies it. After successful verification, the voter nominates I other peers from its own reference list to be part of the poll-originator’s outer circle list. Then the voter will construct the Vote message, which basically consists of a hash of the AU interleaved with a computational effort. If a landslide loss occurs, then the poll-originator can respond with a RepairRequest message to any one of the voter. The voter, if the RepairRequest corresponds to the correct AU that was voted on, will respond with the Repair message. In order to deter attackers, the LOCKSS system uses provable recent efforts, rate limiting, reference list churning, and obfuscation of protocol state (by encryption of all but the first message in the series of exchanges, and also by making all loyal peers go through the motions of the protocol). The paper then goes on to describe several possible attacks, such as Stealth Modification, Nuisance, Attrition, Theft, and Free-Loading, and specifically targets Stealth Modification because it possesses the greatest threat to LOCKSS. In the implementation, it specifically models the Stealth Modification attack. From the results, it is seen that with a subversion rate of upto 1/3, the stealth modification attack will fail and after that, probability of irrecoverable damage gradually increases. Given a high churn rate of the reference list, the initial subversion rate (for the adversary to bypass in order to cause damage) is very high too. Comments: What I find interesting is that this paper’s protocol already takes the Byzantine Fault Problem into account. Also, given some optimizations such as high churn rate of reference lists, one can greatly reduce probability of malicious stealth attacks. From: Raghu Kiran Ganti Sent: Wednesday, February 22, 2006 9:59 PM To: Indranil Gupta Subject: 598ig review 02/23 Review for P2P apps (Feb. 23, 2006) ----------------------------------- Paper title: End-to-End Arguments in System Design Summary/Main ideas: ------------------ This paper is more an ``idea'' paper, where a general idea is presented, without any simulations or evaluations. As such, commenting on this paper will not involve any points on strengths, weaknesses of the evaluation mechanisms chosen or real world problems (like, how does algorithm X perform when deployed in a real testbed?). This paper presents an argument for placing of modules in a distributed computer system. Their argument is that the placement of a module in lower layers (in networks) has to be carefully thought out, and propose the idea of placing the modules in end-to-end systems. Towards this, they provide examples as to why the end-user applications should do certain things as compared to leaving this functionality to lower layers. The examples considered in the paper include: careful file transfer, delivery guarantees, secure transmission of data, duplicate message suppression, guaranteeing FIFO message delivery, and transaction management. The authors observe that, certain functionalities should be provided within the application itself and not in the lower network layers. For example, to provide an error-free file transfer, the end-application needs to incorporate techniques like checksum to ensure that the file has reached correctly, instead of relying on the lower network layer to do so. But, such a choice is dependent on the range of applications that require support from the network layer for error-free transmission. If it is the norm, then including it in the network layer is a wiser choice as compared to letting the application handle it. As such, there are pros and cons to such an argument as the authors point out- Performing a function at a low level may be more efficient, if this function can be performed with minimum changes to the low-level subsystem. On the other hand, it could result in in-efficiency, as applications that do not demand such performance will be paying extra and the low-level system can be over-providing! In my opinion, it depends on what functionality is desired by the end-application, whether it is something sophisticated and only part of that particular application or if it is something basic, on which several other applications also rely on. The concept of layering of the network is thus justified, where the applications are built on top of TCP/IP. Although, there are cases where TCP is not the best way to do communications (for example, voice chat), in which case, the end application will be better off not using the lower layers. I think, the question of what to include in the end-application and what not to is quite debatable and will go on! ******************************************************************* Paper title: A Comparison of Overlay Routing and Multihoming Route Control Summary: -------- This paper explores a different routing mechanism as compared to overlays (such as the ones seen in the RON paper from MIT) and studies the performance of what is called multihoming route control to that of overlay networks. In short, multihoming route control is a method in which appropriate metrics (availability, cost) are used to choose a path. For example, a client could be connected to multiple ISPs and can choose from multiple paths one that provides higher performance. This paper studies the performance of a k-multihoming route control against k-overlay in terms of round-trip delay, throughput, and availability. The authors show that a 3-multihoming route control mechanism performs comparative to a 3-overlay network, and provides other advantages (in terms of deployment and operational overhead). Main ideas: ----------- This paper is based on a simple idea - using multiple ISPs to provide better paths. The more important part is in how to compare with overlays and what performance metrics are used while doing such a comparison. The authors address three main questions: 1. On what fraction of end-to-end paths does overlay routing outperform multihoming and what is this extent of out-performance? 2. What are the reasons for the performance differences? 3. How are the path availability rates of multihoming when compared to overlays? With the above goals in mind, the authors measured the round trip delay, throughput (TCP), and availability of a 68 node network through the means of emulation. Since, they used commercial ISPs, they cannot ``really'' implement their method, rather only provide a work-around! They observe that 1-overlay and k-multihoming perform significantly better in terms of round-trip delay and throughput compared to 1-multihoming (the traditional Internet). The key observation is that when k-multihoming is compared to k-overlays, k-overlays performs better, but with only marginal benefits. They also observed that in some cities, this performance benefit could be pronounced. Overlays provide better performance due to their ability to avoid congested ISP links. But, they also violate inter-domain routing policies, which are absent in multihoming route control. With respect to path availability, overlays perform slightly better than multihoming, although 2- and 3-multihoming do significantly better than the traditional case. Strengths: ---------- The main strength of the paper lies in the extensive analysis done and the simplistic idea presented, which does not require significant deviation from the current traditional Internet. Apart from this, I really do not see any major/novel idea in this paper. Also, it is impressive that the authors took significant pains in doing a fair comparison between k-overlays and k-multihoming route control. Weaknesses: ----------- In general a concern with deployment of newer technologies in the Internet are, how far will they be ``really'' a success? What would be the cost of providing the services to an end-user? How will it perform when deployed on a much larger scale (rather than 68 nodes)? From: Hussam Hasan Abu-Libdeh Sent: Wednesday, February 22, 2006 9:54 PM To: Indranil Gupta Subject: 598ig review 02/23 The two papers that i'd like to review are: 1. A Comparison of overlay routing and multihoming route control. 2. Preserving peer replicas by rate-limited sampled voting. A Comparison of overlay routing and multihoming route control: -------------------------------------------------------------------- Currently, Border Gateway Protocol (BGP) is used to route messages across the internet. However, end-to-end transfers suffer failures & poor performance. In response to that, some people have blamed BGP for such problems and have proposed the use of "Overlay Routing" to solve that problem. The main theme of this paper is to discuss "How much advantage is Overlay Routing going to give us over BGP when multihoming and route control are considered ?" The paper compares overlay-routing and route-control with respect to the degree of flexibility available at end-networks. That flexibility is represented by 'k', the number of ISPs available to either technique at the end-network. So, for multihoming they consider "k-multihoming" which means that we connect k ISPs, and they also introduce the term "k-overlays" which means that we have 'k' providers available for an end-point for any end-to-end overlay path. The comparisons that are performed are inconsideration to either round-trip-time (RTT), or throughput. This paper poses three main questions to answer: 1. On what fraction of end-to-end paths is overlay routing better than multihoming ? and by how much is it better ? 2. What are the reasons for one approach being better than the other ? 3. Does route control, when supplied with suf?cient ?exibility in the number of ISPs, achieve path availability rates that are comparable with overlay routing? The paper then starts measuring performance under different configurations and different scenarios. And they notice that k-overlays and k-multihoming are both better than 1-overlays, and that k-overlays are significantly better than k-multihoming for a small fraction of transfers. After that it tries to reason as to why the results turned out that way: 1. Propagation Delay & Congestion Improvement: Most of the RTT performance gains of overlay routing arise from its ability to find shorter end-to-end paths compared to BGP paths, but more importantly, they come from the ability of overlay-routing to avoid congested ISP links. 2. Inter-domain and Peering Policy Compliance: Basically, to get better performance in terms of RTT, overlay paths sometimes violate common inter-domain routing policies (as compared to BGP). The final observation of this paper was that overlay routing provides significant advantages over route control and multihoming mainly because overlays can find routes that are physically shorter (along with other factors). Next they consider cost & deployment issues. Personally, i enjoyed this paper, i really like its organization and design. The outcome was sort of expected (after giving it some thought of course!). Personally, i have always viewed multihoming as some sort of a special case of overlays, after all, overlays can span multiple networks and can act as route-control. Great paper nonetheless. ---------------------------------------------------------------- Preserving Peer Replicas By Rate-Limited Sampled Voting: ---------------------------------------------------------------- The LOCKSS project (Lots Of Copies Keep Stuff Safe) is a world-wide p2p application for preserving access to archival info. It consists of a large number of independent, low-cost, persistent web caches that cooperate to detect and repair damage to their content by voting in "opinion polls". This suggests a voting protocol that includes rate limitation and intrusion detection such that even huge attacks don't cause alot of damage before being detected. In LOCKSS, the opinion polls provide the peers with confidence in the content authenticity & integrity. Peers vote on large archival unites (AUs), this mechanism defends against free-loading (since in order to benefit from the system (repair a doc), peers must have participated in the past, thus no free-loaders will benefit). This mechanism also doesn't increase the degree of theft because peers will only supply material to other peers that have participated in the past. This paper introduces 4 types of nodes: 1. Malign: a peer that is part of a team that is trying to subvert the system. (malicious) 2. Loyal: is a good peer essentially. 3. Damaged: is a peer with a damaged AU. 4. Healthy: is a peer with the correct AU. There are 2 main roles in this protocol: poll initiator, and voter (poll participant). In this protocol, each peer maintains two peer lists for every AU it holds: the reference list contains info about other peers it has recently encountered; and the friends list contains info about peers that this peer has an out-of-band relationship with. Additionally, each peer has a poll counter for every AU that counts the number of polls called on that AU since its acquisition. The paper then describes briefly the different steps in the algorithm which are: 1. Bootstrapping 2. Poll Initiation 3. Poll Effort 4. Outer Circle Invitation 5. Vote Verification 6. Vote Tabulation 7. Repair 8. Update the lists 9. Poll Solicitation 10. Poll Effort Verification 11. Vote Construction 12. Repair Solicitation 13. Alarms (Note: personally, i thought that the authors are over classifying things, and are giving too many names and construction alot of steps that could be merged together) Next the authors discuss ways in which to ensure that malicious users won't be able to take over the system quickly. To do that, they do a bunch of things. First they make the size of the effort of malicious users large, they achieve that with the use of MBF (memory-bound-functions) that produce a proof of effort by a computational phase. They also make the task of overtaking the system a very time-intensive one, for example they limit the rate at which the attack progress to the rate of the slowest attacker's effort and the effort of its children (they also define other techniques that include obfuscation). The paper then went on to describing the different adversary capabilities and the different attacks they might generate followed by simulation and analysis of results. Personally, i have to say that i found this paper boring, with all due respect to its authors, and i found the paper "over-categorizing" everything (by the end of it they could have created a dictionary of terms that they used!!). I guess my main problem with this paper is its application. The problem of archiving is not fit for p2p systems (in my humble opinion). Gigs storage are very cheap; an institution could invest its money in purchasing multiple tera-byte servers and put them at different geographical locations and ensure periodical replication and update. I believe that this is a more economical solution. If you are devoting some machines for p2p archiving, then why not just use one of them as a server and reduce the complexity of this problem. I am a believer in the KISS (Keep It Simple Stupid) principle (well, for most of the cases), and i guess that is why i am not that fond of this paper, again with all due respect to the authors. From: Anthony Cozzie [acozzie@gmail.com] Sent: Wednesday, February 22, 2006 4:29 PM To: Indranil Gupta Subject: 598ig review 02/23 Anthony Cozzie review 23/02/2005 1. Preserving Perr Replicas by Rate-Limited Sampled Voting This paper discusses a P2P archival system for a library. Each node stores some set of academic journals, and the problem is how to perform operations (for example, replacing data lost due to a disk failure) on this data securely. The method here is simply polling. When a peer needs to check if its data is accurate, it polls N other peers and compares their checksums. The primary attack to such a method would be inserting numerous malicious nodes, but the scheme is fairly resistant: the list of nodes polled is regularly churned, and when the list is generated it is done so in a way that 1 node cannot invite a large number of compromised nodes at once. 2. A Comparison of Overlay Routing and Multihoming Route Control Someone doesn't like RON. This paper is a comparison of RON (overlay routing) and BPG multihoming (where the routers maintain multiple routes and switch among them if they decide one is down). BGP uses very simple heuristics to route packets; the paper claims this is more or less enough (~10% worse). The authors used the Akamai network and inserted small (RTT) and large (bandwidth) downloads periodically. I am not quite sure what to think of this paper; I am not sure their conclusions "multihoming is good" is supported by "multihome is expensive and slower than RON". From: Praveen Jayachandran Sent: Wednesday, February 22, 2006 6:11 PM To: Indranil Gupta Subject: cs598ig review 02/23 CS598IG Review 02/23 Praveen Jayachandran End-To-End Arguments in System Design ------------------------------------- This paper presents a design principle, called the end-to-end argument, which suggests that supporting functions entirely at the lower levels of the system may be redundant or of little value compared to the cost of supporting them. The principle has been applied in several systems in the past, without much thought about its implication. Examples such as bit-error recovery, security using encryption, duplicate message suppression, recovery from system crashes, and delivery acknowledgment are sighted to support the argument. The authors suppose that this design principle can help guide the placement of functions among the modules of a distributed system. The main argument in this paper is that functions can be correctly and completely implemented only with the knowledge of the application at the end points of the communication system. Since, the lower level subsystem at which the function is implemented, is common to many applications, those applications that do not need the function would have to pay for it anyway. Also, the lower level have lesser information and may not implement the function efficiently. Consider an application where A is transferring a file to B, over an unreliable system and network. If recovery is performed by the end systems, the communication subsystem becomes extremely simple, but the overall cost of recovery may be too high. If recovery is performed in the communication subsystem for each packet, the cost will be lower. However, an application that does not use this function would have to pay for it anyway. Thus, the end-to-end argument does not suggest which layer should implement the function. Historical examples and debates on the use of the end-to-end argument are presented. The objective of the paper to aid proper layering of subsystems and their functions is discussed. Comments: 1. As discussed in the paper, the end-to-end argument has to be applied with sufficient care. Some functions are better implemented at the lower layers to support modularity and re-usability of functions, and even to reduce cost. The end-to-end argument, however, provides no means to identify which functions are to be implemented at the lower layers and which ones are best implemented at the application layer. A Comparison of Overlay Routing and Multihoming Route Control ------------------------------------------------------------- The use of overlays have been recently proposed to circumvent the BGP routes to improve end-to-end performance and fault tolerance. This paper presents a comprehensive comparison of overlay routing and multihoming route control, and concludes that intelligent control of BGP routes coupled with ISP multihoming, can provide competitive end-to-end performance. The two mechanisms are compared in terms of round-trip latency, TCP connection throughput, and path availability. A multihoming environment is emulated using a few singly homed nodes in a metropolitan area collectively as a single multihome node. The main results of the comparison are as follows. 1-overlays perform significantly better than single-homing in terms of both round-trip times as well as throughput. Both k-multihoming and k-overlay routing offer significantly better performance compared to single-homing. For k greater than 2, k-multihoming is found to perform better than 1-overlay, which suggests that when the end-point is allowed greater flexibility in the choice of BGP paths via multihoming, the advantage of overlay routing is vastly reduced. For k greater than 2, the advantage of k-overlay routing is only marginal compared to k-multihoming. The authors also determine that a majority of the RTT performance improvements of overlay routing arises from its ability to find shorter paths. However, the most significant improvements are due to its ability to avoid congested routes. Comparing the overhead of multihoming and overlay routing, multihoming is easier to deploy, while overlay routing requires a third party to provide the infrastructure. The paper also estimates the amount of policy violations due to overlay routing. The overlay provider pays for the additional expenses that the ISP incurs due to policy violations, limiting the negative impact of overlay routing. Multihoming would require additional routing table entries which could cost more. Also, having additional overlay nodes could alter the observations in favor of overlay routing. Comments: 1. The improvements due to multihoming over overlay routing is evident only when 3 or more ISPs are used, and this could alter in favor of overlay routing with more nodes in the overlay. Although the authors conclude otherwise, it appears that overlay routing will be cheaper and more efficient than subscribing to 3 different ISPs. 2. When multihoming is used, how would the end-user be charged for the subscription. A pricing scheme based on how much traffic is routed via each ISP needs to be devised. 3. The experiments are carried out assuming an idealized form of multihoming, where the end-network has instantaneous knowledge about the performance and availability of routes, and that the end-network can control the ISP link traversed. These assumptions are unrealistic and the study could be biased towards multihoming. 4. Although, a thorough study is made, the conclusions drawn seem far from convincing. From: Juan Jose Jaramillo Jimenez Sent: Wednesday, February 22, 2006 12:11 PM To: 'Indranil Gupta' Subject: 598ig review 02/23 1 End-to-End Arguments in System Design 1.1 Summary This paper presents arguments and examples to illustrate the end-to-end argument that suggests that functions placed at low levels of a system may be redundant when compared to the cost of providing them at that low level instead of a higher layer. The authors suggest that implementing such features in lower layers can only be justified as performance enhancements. The article examines the nature of the argument and how general it can be. The argument is meant as a design principle to help designers move a function upward in a layered system. The end-to-end argument states that a function that cannot completely and correctly be implemented in a low level layer then it should be implemented in an upper layer, mainly in the application itself. The authors clarify that it would be too simplistic to conclude that the lower levels should play no part, since some effort by these levels can have a significant impact on application performance, but the key idea in the argument is that the lower layers need not to provide "perfect" solutions, making the amount of effort put on lower levels a tradeoff based on performance, rather than a requirement for correctness. The authors through example emphasize that the end-to-end argument is not an absolute rule, and that it depends on the application requirements. 1.2 Comments As the authors suggest, the end-to-end argument is not an absolute rule, but a design guideline that simply helps on placing functions at the right points. I think that it must be taken with care, since a careless application of the argument can only lead to performance degradation. To see my point, nowadays it is common to see the use of both TCP and UDP as the preferred transport protocols: both coexist and are useful for different applications. Using the end-to-end argument to say that we do not need to use TCP functionality (in-order delivery, avoiding duplicate messages, and so on) anymore because it makes more sense to implement it in the application layer only lead us to programs that take longer to develop and can have potentially more bugs since more functionality must be implemented in them. Even more, it could lead to the "reinvention of the wheel" problem, since TCP includes sophisticated tools for congestion control, that the application layer does not currently provide. 2 A Comparison of Overlay Routing and Multihoming Route Control 2.1 Summary Different studies have shown that the underlying Internet connectivity can provide better performance and resilience than end-points currently receive. Those studies have suggested that the use of overlay routing, bypassing BGP, improves both resilience and performance. The paper explores the question of whether overlay routing is indeed required, or if better route selection with BGP using ISP multihoming is sufficient. The authors state that current research projects compare BGP and overlay routing in a rather unfair way: overlay routing has nearly arbitrary control over the wide area path while BGP has limited control over the path a packet takes through an ISP network. Also, overlay routing chooses the best path to optimize end-to-end performance metrics, while BGP uses simpler heuristics, like hop count. However, the authors warn that they do not assume any changes to the BGP protocol, they only study the effect of ISP multihoming and route control in conjunction with BGP. The authors compare both approaches with three different metrics: RTT, throughput and availability. They found that multihome BGP routing can offer a slightly worse RTT performance than multihome overlay routing, while multihome BGP performs better than multihome overlay routing. In the case of availability, they found that route control is not as perfect as overlay routing, but it is sufficient for almost all observed failures. The conclusion of the paper is that overlay routing is not necessary to achieve good end-to-end resilience and performance. These goals can be achieved using ISP multihoming coupled with intelligent route control. 2.2 Comments The implicit assumption authors make is that ISP multihoming guarantees different routes, while this does not necessarily apply to all practical cases. To see this, one must realize that sometimes different ISPs can share part of the infrastructure to reduce cost, or that "smaller" or "new" providers can lease infrastructure from the incumbent ones. In this case, the performance gains obtained by ISP multihoming and route control cannot be as dramatic as stated, so results must be taken with care. However, this does not diminish the contribution of this paper, i.e., that providing route diversity in conjunction with BGP can avoid the problem of deploying overlay networks that circumvent BGP.