From: qwang26@uiuc.edu Sent: Tuesday, February 12, 2008 9:59 AM To: Gupta, Indranil Subject: CS525 review 02/12 Exploring the energy-latency trade-off for broadcasts in energy-saving sensor networks M. J. Miller, C. Sengul, I. Gupta Sensor networks are always resource-constraint, where sensor nodes have limited computing power, memory and energy (battery lifetime). Whereas, broadcast, which is an important tool in sensor networks, is energy and latency sensitive. In this paper, authors concern a scenario of energy-conserving sensor network. Their goal is to design a probablistic broadcast protocol in wireless sensor networks with tradeoff between energy, latency and reliability. The main contribution of this paper is that: proposing a new energy-efficient broadcast protocol for WSNs, and analyze the tradeoff relationship between energy and reliability, latency and reliability, besides, accomodate their protocol with existing sleeping-based applications as well as current MAC protocols. In particular, they introduce two parameters: p, which is the probability that a node rebroadcasts a packet in the current active time even if some of its neighbors are asleep; q, which is the probability that a node is awake after the active time when it normally would sleep. By selecting different values of p and q, they can tune the relability and energy consuming around the threshold level. Epidemic algorithms for replicated database maintenance A. Demers, D. Greence, C. Hauser, W. Irish, J. Larson This paper considers a database replicated at many sites in a large heterogeneous, slightly unreliable and slowly changing network of hundreds of sites. Each database update is injected at a single site and must be propagated to all the other sites or supplanted by a later update. The goal of this paper is to design a algorithm that is efficient and robust and that scales gracefully as the number of sites increases. The authors discuss two epidemic algorithms to update the database: anti-entropy, where each site regularly chooses another site at random and by exchanging database contects with it resolves any differnces between the two; remor mongering, where a site becomes a hot rumor when receiving a new update, and a site holding a hot rumor periodcially chooses another site at random and ensures that the other site has seen the update. In particular, the author decript a complicated case that an old item is removed from the database. To acheive that with robustness, they adopt DEAD certificate mechanism together with the above two epidemic algorithms. In addition, in order to reduce the traffic generated by anti-entropy without unacceptably increasing its convergence time, they introduce nonuniform spatial distribution. From: hossein.ahmadi@gmail.com on behalf of Hossein Ahmadi [hahmadi2@uiuc.edu] Sent: Tuesday, February 12, 2008 9:21 AM To: indy@cs.uiuc.edu Subject: 525 review 02/12 \title{CS525 Paper Review : Epidemics} \author{Hossein Ahmadi\\ hahmadi2@uiuc.edu} \section{Epidemic algorithms for replicated database maintenance} The paper addresses the problem of performing updates and deletes in a replicated database in which several copies of the same record are stored in different sites. Propagating updates initiated in one site can be very costly if it is possible at all. The idea of the paper is to use epidemic algorithms instead of directly sending a propagated update or delete. In particular, authors describe a simple epidemic algorithm (Anti-entropy) as well as a complex epidemic algorithm (Rumor mongering) and claim that the best performance is achievable through a combination of both. In Anti-entropy algorithm, each site periodically compares its database with another random site and makes the required updates. In Rumor-mongering algorithm, each site stores a set of recently updated record and propagate them randomly to other neighbors. After some time, the site removes a record from set of recent changes due to different policies presented in the paper. The paper also describes the modifications required to handle delete operation where some old versions of a record can be considered as an update when a site deletes a record. The paper presents a general design approach for an epidemic algorithm which is very interesting. Different aspects of each approach has been well presented and studied. For example, push, pull, and their combination, along with rumor-mongering policies are vastly explained and studied through analysis or simulation. Therefore, the reader have a clear idea of pros and cons of each design approach. There are some issues needed to be resolved. There are several optimization techniques for the database synchronization used in anti-entropy algorithm. For example, a randomized approach which chooses some records randomly to be synchronized may be effective. Another issue is the interval at which the site tries to perform an anti-entropy. It is highly dependent to the frequency of updates from end users and the system needs to adapt with it. Also, it is desirable that during rumor-mongering nodes can act more intelligently. For example, they can incorporate the topology information to infer the nodes with lowest probability of having the rumor. One fundamental question is whether the epidemic algorithms are effective in the scenario of fixed replicated databases or not. The authors compare epidemic algorithms with the most naive way of propagating an update, but using some more complicated algorithm may even have a better result than epidemic algorithms. The point is that the topology information and nodes are relatively static over time with predictable communication capacities. Therefore, the problem can be modeled in a deterministic manner effectively where no epidemic scheme is needed. \section{Exploring the energy-latency trade-off for broadcasts in energy-saving sensor networks} This paper presents the trade-off between energy consumption and latency-reliability in broadcast applications in wireless sensor networks. Authors first propose a new protocol called PBBF which probabilistically forwards broadcast messages as well as probabilistic sleep and wake-up scheduling. Specifically, PBBF employs two parameters, $p$ and $q$, which are the probability of forwarding a broadcast and probability at which a node wakes up during its sleep interval to listen to broadcast packets. Now, the problem is how to tune $p$ and $q$ in order to obtain a desirable operational point in terms of energy consumption and reliability. The paper performs an analytical study on the trade-off between energy and latency or equivalently reliability of a broadcast. Next, the protocol is evaluated through simulation and the impact of design parameters such as $p$, $q$, or network parameters such as deployment density has been studied. The paper is well written and gives a clear presentation of the idea. The proposed PBBF is not a completely new protocol since both probabilistic forwarding and probabilistic sleeping protocols have been proposed before. However, the main contribution of this paper is the analytical results on the trade-off between energy consumption and latency-reliability. The paper also presents a through evaluation of the protocol which helps better understanding the design issues. From: rebolledodaniel@gmail.com on behalf of Daniel Rebolledo Samper [dreboll2@uiuc.edu] Sent: Tuesday, February 12, 2008 9:14 AM To: Gupta, Indranil Subject: 525 review 02/12 1. EPIDEMIC ALGORITHMS FOR REPLICATED DATABASE MAINTENANCE In this paper, the authors explore probabilistic techniques to synchronize a distributed database. First, they present three approaches to inserting new records: the naïve or "direct mail" technique, based on contacting other replicas; "anti-entropy" which relies on random pair-wise comparisons of processes' copies; and variations of rumor mongering, called "complex epidemics". These include a discussion of mixed approaches (rumor mongering and anti-entropy, for example) to guarantee that all processes are updated eventually. Second, they extend these techniques to deleting records. To delete a record, they actually insert a new record called a "death certificate" and rely on the algorithms described earlier to propagate them. In order to save spaces, death certificates must be deleted eventually, but there is always a possibility that a copy of the underlying record remains. Therefore, some "dormant" death certificates are placed on a small number of machines for longer than usual, and they act as an acquired immune response to the original record, getting "activated" if they encounter it. Finally, the authors refine these algorithms by forcing nodes to talk only to "nearby" nodes. They examine the impact of several notions of proximity and measure their performance in the Xerox Corporate Internet, comparing it to the Clearinghouse name service which had been using the more inefficient naïve technique. The paper presents a large array of "building blocks" for protocols based on epidemics, like anti-entropy vs. rumor mongering, push vs. pull, blind vs. feedback, and does a good job at exploring the main combinations of these building blocks. The important point here, is that it demonstrates how simple algorithmic techniques can be combined to compensate for the deficiencies of each one individually. It also shows that randomized algorithms can efficiently replace their deterministic counterparts while retaining strong correctness guarantees. Another important point is that they test the applicability of their mathematical models (which may be very good in theory) to the problem at hand and then fine-tune them accordingly. Maybe an aspect that was somehow lacking in their paper was an empiric validation of the mathematical models. They suggest that they in validated some of their models in p. 6 but some additional simulations would perhaps have uncovered some deviations from what the theory predicts. 2. BIMODAL MULTICAST This paper presents the probabilistic broadcast protocol. It is a two-phase multicast protocol based on epidemics: in the first phase, they use an a priori unreliable protocol to send the message to as many processes as possible, and then use anti-entropy to compensate for lost packets. Their protocol is designed to provide not so much atomicity (though they have a similar but weaker property) but a stable throughput. They achieve this by altering the anti-entropy algorithm to recover packets in decreasing order of age so that a small number of lagging processes can't drag the overall performance of the system. They also give seven important optimizations that make the algorithm viable in practice. The authors draw ideas and methods from probability theory to analyze their protocol formally, but they also conduct experiments to validate their analyses, and also to explore additional behavior: they compare the performance of their system to CMU's Ensemble system (in essence extending this multicast framework). They also run simulations in NS2 against the Scalable Reliable Multicast protocol and look at a WAN implementation of their protocol in a small LAN. An interesting point of the article is that they show how a small optimization like forcing processes to recover only the most recent messages they've missed results in a dramatic increase in performance when compared to regular protocols. Their approach is also remarkably resilient to noise and transient errors, and it only shows significant degradation in extreme cases of both unreliability and throughput. The system also seems quite scalable as the delay from the originating node to the end nodes seems to follow a logarithmic trend relative to the total number of nodes. However, it seems that many of their protocol's advantages derive from the willingness to accept weaker atomicity guarantees. For example, they're willing to accept that with low, but non-negligible probability, a given message ultimately fails to reach, say, 5% of clients. Of course they explain that many applications can tolerate these losses, most notably multimedia applications and in general application whose state depends only weakly on past states. However, their claim that this is acceptable in more life-critical situations needs to be bolstered, in my opinion. From: Rahul Malik [rmalik4@uiuc.edu] Sent: Tuesday, February 12, 2008 8:54 AM To: Gupta, Indranil Subject: 525 review 02/12 Submitted by: Rahul Malik rmalik4@uiuc.edu EPIDEMIC ALGORITHMS FOR REPLICATED DATABASE MANAGEMENT SUMMARY: In this paper, first of all authors discuss three major techniques for spreading updates on a distributed system. The first technique is by direct mail, in which the updates are directly mailed to all the other sites from the site where it originated. Global timestamps using time are used to determine whether an update is a new one or not. The other two approaches are based on randomization. The second approach is anti-entropy, where a site is chosen at random and the contents of databases of the two sites are exchanged and mutual differences are overcome. Since exchanging the databases is a costly operation, authors describe few solutions such as maintaining a recent window list and inverted index so as to make this operation efficient. Finally, the third approach discussed is rumor mongering, which is based on epidemiology. When a rumor is hot, it tends to spread fast. The authors have described a detailed analysis of this approach and have presented various ways in which one can find whether a rumor is hot. Next, the paper discusses about deletion because we can't expect to delete the local copy at one site and expect it to spread. Instead opposite would happen. So, death certificates are issued and ways of keeping it are discussed. In the end, authors have discussed spatial distributions because choosing the sites completely at random uniformly instead of favoring nearby stations leads to much more network traffic. PROS: In this paper, authors have discussed many techniques for spreading the updates. One of the strong points is that they do not consider very strong guarantees from the underlying network. The approach using rumor mongering is a new one and is very often used now a days for various purposes. Thus, this is a significant contribution of the paper. They have also provided a strong analysis based on equations. CONS: One of the major drawback of the paper, which I think, is that they have not done studies of all their algorithms on a real testbed. If that is done, then it can be used to strongly back up their theory. This is what I suggest what should be done. Also, they do not take into account the packet drops in the network and the issue of security in a network. They have also not described explicitly that how one discovers various sites present in the network. FUTURE WORK: They must test their protocol on a real test-bed so that more quantitative results can be obtained. Also, they should compare their protocol with existing ones so that comparative assessment can be performed. QUESTIONS: 1)How does push-pull operation works and its need? 2)Analysis of T(n) for the spatial distribution is not given? 3)How to solve for “s” in the equation for complex epidemics? EXPLORING THE ENERGY-LATENCY TRADE-OFF FOR BROADCAST IN ENERGY-SAVING SENSOR NETWORKS SUMMARY: In this paper, authors describe Probability-Based Broadcast Forwarding protocol for forwarding broadcasts in a sensor node network. They have considered the trade off between energy-latency and reliability. The reason for this is that as the latency is tried to be reduced by keeping the sensor node awake for longer periods, the energy consumption goes up and vice-versa. To address the battery consumption, MAC protocols have sleep mode in which very little power is consumed. However, they are deterministic in the approach. Here, they have presented a probabilistic way of choosing the probabilities. So, the authors have described two design parameters p and q, which are related to whether a node should broadcast a packet immediately and whether it should stay awake during the period of the schedule when it should be sleeping. Next, the authors have presented analytical results for reliability, energy, latency and their trade offs. They have used the analysis from the percolation theory for analyzing their approach. In particular, they have used bond percolation model for analysis. Finally, the authors have also validated their model using simulation results on ns-2 network simulator. They have shown the effect of q parameter and the effect of node density on the protocol. PROS: One of the strong points of the paper is that their protocol can be incorporated in the standard MAC protocol which allows sleep scheduling mechanism. Also, it does not require and extra hardware component such as additional low power wake-up radio for its functioning. Thus, it is robust in this regard. In the paper, the authors have done a lot of analysis of the protocol theoretically. This is one of the strong points. CONS: In this paper, there is not much scientific contribution. The analysis is done using standard techniques of the percolation model. Also, they assume an ideal MAC and physical layer with no collisions or interference for their models, which is not at all true in practice. Thus, their theoretical formulation is weak in this regard. Even for simulations, they do not describe the parameters and the model which used for analysis on ns-2 protocol. FUTURE WORK: I suggest that a more thorough analysis must be performed theoretically by taking into consideration the interference and collisions. Also, instead of choosing p and q statically, analysis must be performed so that they are dynamically chosen and updated by the nodes. QUESTIONS: 1)How to choose p and q in actual practice for an implementation? From: Md Yusuf Sarwar Uddin [mduddin2@uiuc.edu] Sent: Tuesday, February 12, 2008 2:07 AM To: Gupta, Indranil Subject: 525 review 02/12 Energy-latency trade-off for broadcasts The authors present a modification to 802.11PSM to support network wide multi-hop broadcasts (i.e., flooding) achieving low latency and good coverage. Usually in 802.11PSM operations, nodes are synchronized by a beacon interval (BI), and they all wake up periodically at the beginning of the beacon interval. Each BI duration has two components, an ATIM duration when nodes who have data to send to some neighbor announces it in order to schedule the receiver to be active in the rest of BI. The nodes who do not have anything to send or receive sleep rest of the BI. Generally when a node receives any packet in its BI, it cannot forward it other node because the next hop might be sleeping at that time, instead it waits up to the next BI to declare it and then forward. The authors brings a change here for handling broadcasts. Instead of waiting for the next BI to forward the broadcast, why doesn't the receiver immediately rebroadcast it, since any of the neighbor node can be its recipient? But the neighbors are sleeping at that time! Why don't let some of them be wake up at that interval? That's the trick. With some probability p, a broadcast receiver decides whether it will forward the packet immediately in the same interval or wait for the next BI, and with some probability q nodes remain alive to catch such packets, although they were originally scheduled to sleep at that interval. The authors argue that this could reduce the latency of broadcasts with slight increase of energy. Pros: - A probabilistic attempt to reduce latency of network wide broadcast in 802.11PSM. - By using percolation mod els, the authors analyze the coverage of the broadcast into the whole network. Obviously, it's not 100%, but can be made very close to it by tuning p and q. - Sound mathematical relations between energy and latency are presented. Cons: - 802.11PSM mainly focuses on energy saving with the cost of performance, delay and throughput. But here the authors are trying to reduce delay with the cost of energy, that goes somewhat opposite of 802.11PSM's intention. - Network wide broadcast does not seem to be an important operation for which more energy can be warranted while reducing latency. Of course, if it were intended to support some end-to-end flow, then it could be a more interesting problem. - The authors present the scheme for sensors networks. Do sensor networks use 802.11PSM at all as their MAC? Comments: - How does 802.11PSM' usually broadcast? The broadcasting node declares data for all neighbors at ATIM, and eventually makes all of its neighbors to be active in that period, while some of them might have already received the broadcast packet earlier. What happens if some nodes inadvertently go to sleep without receiving the broadcast, although they were told to receive that? Does it happen to be a epidemic like flooding? - Can the scheme be extended to support some arbitrary or particular flow for low latency in 802.11PSM? Clearly, this requires more coordination among the forwarders of the flow, rather than random wake/sleep. Epidemic algorithm for replicated database maintenance The authors present a set of strategies for spreading updates in a replicated database maintenance system. The proposed methods are 'direct mail', 'anti-entropy' and 'rumor mongering'. Direct mail simply sends notifications to all replicas to reflect updates among the entire system soon after an update occurs, but it's costly and prone to fail to reach all nodes simultaneously. Anti-entropy is a gossip based protocol where a node randomly picks another node, and matches the entire database to adjust latest mismatches. With its three variants (pull/push/push-pull), anti-entropy runs a lazy update across the system and converges in moderate count of rounds, but it requires entire database matching and it takes long time (though, some timestamps and checksum based techniques are suggested to reduce this time). In 'rumor mongering', whenever any replica node encounters any update to any of its data item, it randomly picks another node to “infect” that node with this update. In addition to using the two states (as appears in 'simple' epidemics) namely 'infective' and 'non-infective' (susceptible), 'rumor mongering' brings another state into action, called 'removed' -- when an infective node makes an attempt to talk another node that is already heard of updates (infected), the caller node becomes 'removed' and does not participate in further gossiping with some probability. The authors showed that this change significantly reduces the update traffic in the network without affecting the coverage and convergence time. Some variations to the base scheme like blind/feedback, counter/coin are also introduced. Comments: - It's a pretty early paper and a seminal work. Comprehensive mathematic explanation behind almost every intuition is a beautiful matter in this paper. Possibly due to this work, we currently know of 'lazy update' method in distributed data replication. - While forming differential equation for complex epidemics, the authors omitted some constants that usually appear in the mathematical model of epidemic. For example, they wrote, ds/dt = -s*i, but it's preferable to be something, ds/dt = - alpha * s * i, with some constant alpha. This alpha determines the rate at which the 'infection' is being spread into the network. - The presented schemes are mostly probabilistic and does not guarantee anything to its entirety. But the most exciting thing about the randomized algorithms is that they are very simple...simpler to understand, simpler to analyze and simpler to implement than their deterministic counterpart. The question is how reliable the solutions are. Can it happen that on some bad instance time, all random choices are gone mad and the entire system collapsed? - Interestingly enough the paper doesn't contain any plot, although it presents numeric experimental results. Plots are more friendly to understand the underlying relations among the parameters and their impact on outputs. From: dkassa2@uiuc.edu Sent: Tuesday, February 12, 2008 1:37 AM To: indy@cs.uiuc.edu Subject: 525 review 02/12 Review 1: Paper Title: Exploring the Energy-Latency Trade-off for Broadcasts in Energy-Saving Sensor Networks The paper proposes a new Probability-Based Broadcast Forwarding (PBBF) protocol for wireless sensor networks (WSNs) and thereby explores their energy-latency-reliability trade-offs. Unlike previous studies of probabilistic broadcast in wireless networks which work outside the MAC protocol the proposed PBBF works at the MAC layer and can be integrated into any sleep scheduling protocol. PBBF can help an application designer to tune the system to an appropriate operating point along the reliability-resource-performance spectrum. The paper simulated (using ns2), and measured the performance of a class of probabilistic broadcast protocols for multi-hop WSNs and quantified the energy-latency trade-off required to obtain a given level of reliability using PBBF. The simulation experiments indicate that PBBF is an efficient broadcast mechanism. The paper makes a good contribution in the area of WSN. I believe that it is presented in sufficient details. It would be more convincing and good if PBBF was implemented in real world wireless sensor network environment. Apart from this I have no serious negative comments against the paper. ================================================ Review 2 Paper Title: EPIDEMIC ALGORITHMS FOR REPLICATED DATABASE MAINTENANCE The paper presents randomized algorithms for distributing updates and driving the replicas toward consistency in database systems. The algorithms which are similar to epidemics ensure that the effect of every update is eventually reflected in all replicas though they require few guarantees from the underlying communication system. Appropriate distributions are used to tune the cost and performance of the algorithms in the randomization step. The paper examines through simulation and practical experience several methods such as the Direct Mail, Anti-Entropy and Rumor Mongering used for spreading updates. By using well chosen spatial distribution for selecting anti-entropy partners the implemented randomized anti-entropy algorithm reduces average link traffic considerably when compared with another algorithm. Rumor mongering as an efficient replacement for the initial mailing step for distributing updates shows promise. Generally the paper uses a couple of seemingly basic concepts. I found the paper to be a bit scattered and less focused though it gives sufficient details. From: jmtulloss@gmail.com on behalf of Justin Tulloss [tulloss2@uiuc.edu] Sent: Tuesday, February 12, 2008 1:34 AM To: Gupta, Indranil Subject: 525 review 02/12 Hi Indy, Here are my reviews for tomorrow. Thanks! Justin Epidemic Algorithms for Replicated Database Management This paper was largely a case study on how Xerox used epidemics to efficiently communicate database changes over a large number of database sites. They were primarily focused on reducing traffic as their old mechanisms were very traffic intensive. The company's network (CIN) did have a Anti-Entropy simple epidemic working already, and this offered several benefits. For instance, it was guaranteed to eventually propagate all the changes to every database site. The question was how quickly that could happen. The company used to use a direct mail approach where every change was broadcasted to every site the updated site knew of, but this had several obvious defects, one of which was excessive traffic. They began experimenting with rumor mongering to cut down on traffic while still ensuring that as many sites as possible were properly updated. The paper goes through the mathematical justification of this approach as well as the effects of several parameters on the effectiveness of rumor mongering and eventually comes out heavily in favor of a rumor mongering setup. They found that when they took their network topology into account and had an anti-entropy backup, they had far less traffic, more effective distribution of updates, and eventual guaranteed propagation, with very fast propagation in the average case. I found this article to be quite interesting. They set out to improve on a clearly naive baseline (direct mail) and were able to easily improve on that process. Beyond that, the article is largely a survey of the various tradeoffs in implementing rumor mongering. This makes is a great reference article if you want to look into how to tweak your own rumor mongering implementation, but offers little beyond this. Comparing it to other methods of database maintenance would be difficult given only this paper. Exploring the Energy-Latency Trade-off for Broadcasts in Energy-Saving Sensor Networks This paper detailed PBBF, an alteration of the MAC protocol that allowed for sensors to make decisions on whether they should sleep or broadcast based on tunable parameters. This particular implementation of PBBF was based off of IEEE 802.11 PBS, though any MAC implementation that supported sleep-active cycles should work in a similar fashion. The paper argued that by adding two parameters, a systems designer could find the appropriate tradeoff of energy usage, latency, and reliability for his particular setup, and even change the parameters as the network changed. The first parameter, p, was the probability that the sensor rebroadcasts a message immediately instead of waiting until the next active cycle. The second parameter, q, was the probability that the node would skip a sleep cycle, thereby catching any instantaneous broadcasts. The remainder of the paper described the results and resulting tradeoffs of implementing PBBF. It was found that PBBF could be tuned to substantially reduce latency even while guaranteeing a desired level of reliability. However, energy use suffered beyond the base case of p=q=0, though it increased linearly as q increased. The paper concluded by emphasizing the efficiency and flexibility of the system and suggesting that there existed substantial room for further research in auto-tuning of p and q. I felt that this paper did a good job of opening up discussion on the ideas the authors were presenting. The paper clearly demonstrated that even with hard constraints on systems, an efficient software design can make it easy to hit the targeted tradeoff of these constraints. At the same time, I did not think that there was enough evidence presented to take PBBF to the next level. There was not a marked enough improvement over the simpler PBS scheme to justify deployment in a real system. Of course, that may never have been the authors' intention. It does provide a direction of thought, in that by putting decision making capabilities into every layer of software, we are left with very powerful systems that are very tunable. This is immensely valuable in my mind as a designer does not necessarily know when he's designing a solution to a problem how the problem may change in the future. By designing solutions that allow the system to adapt to different conditions, the problem does not need to be solved more than once, even as it evolves into a different problem entirely. From: Riccardo Crepaldi [crepric@gmail.com] Sent: Tuesday, February 12, 2008 1:07 AM To: Gupta, Indranil Subject: 525 review 02/12 “Epidemic Algorithms for Replicated Database Maintenance” This paper focuses on the replica management of a Database in use at Xerox over their Corporate Internet Network. The motivation of this research comes from the consideration that the growing dimension of such a system (in 1987) showed the need for a efficient replication system, since the links were very slow and faulty, thus requiring optimized traffic generation and robust replica strategies. The three strategies combined in this case study are Direct Mail, where the updates are simply sent to all the nodes in the network, Anti-entropy, that randomly exchanges updates among couple of sites, and Rumor mongering, that uses a cheap dissemination algorithm based on a probabilistic approach. A particular emphasis is given to the description of these three strategies that are widely used in many later systems. The performance evaluation is partly expressed by means of probabilistic model and partly using empirical results. An important aspect that is highlighted by this case study is that in the Xerox CIN the bottleneck were due to the small capacity of network links, and then many of the choices in the implementation of this algorithms were necessarily driven by a prior knowledge of the network topology. The paper is very clear and specifies with good details the algorithms implemented. When analyzing it, looking for possible pros and cons it is very important to be aware that in 1987, when the paper was written, the networking world was completely different from what we know today. The idea of epidemic algorithms is very interesting and novel for the time. However the paper lacks a little of a general view, concentrating too much on the case study and justifying many of the choices that could not be proven using a model, providing results from experiments that are very dependent from the Xerox infrastructure. Some of the points that would be interesting to investigate is the real effectiveness of the Death Certificates management, that seems fragile and prone to pathological situations when timeouts can expire before the system reached a stable configuration. Additionally it would be very interesting to revise the paper using a modern database system, in order to understand if the constraints and the bottlenecks are still the same as in 1987, or if the growth of networks capacity and reliability and database dimensions (and the changes in their applications) justify the same set of assumptions or should lead to different conclusions. Additionally the network topologies that the paper describes are now overlays over the Internet topology, and this should probably invalidate some of the considerations and assumption made in the paper. ------------------------------------------------------------------------------ “Exploring the Energy-Latency Trade off for Broadcasts in Energy- Saving Sensor Networks” This paper present a probabilistic approach to the problem of reliable data dissemination using broadcasting in wireless sensor networks. The proposed solution, Probability-Based Broadcast Forwarding (PBBF) is not a new MAC protocol but instead a sleep scheduling strategy more complex than the standard fixed duty-cycle approach. The PBBF protocol makes use of two parameters, p and q, that fix the probability that a node rebroadcast immediately a received packet (p), and that a node remains awake in the period it should be asleep, waiting for a possible communication (q). The reasons for the use of a probabilistic approach is that the usual deterministic approach requires a strong coordination and implies longer delays in the communications, as well as energy waste due to redundant receptions. The paper provides several considerations, analysis and experimental results, showing how the choice of the parameters q and p has a strong influence on reliability, latency and Energy consumption of the Broadcasting algorithm. The strongest points of this work are the simplicity of the solution proposed, that fits with no or small effort on the top of the sleep schedule of the majority of MAC algorithms known in literature, and generalizes the concept of sleep scheduling. Additionally, both the analysis and the simulations provide a good description of the protocol behavior and effectiveness, and allow the final user to have an idea of how the choice of the p and q parameters will influence the algorithm performance. The future work at the end of the paper summarizes well the points that could be improved in this research. Will it be possible and useful an adaptive scheme that varies the p and q parameters dynamically? And how does this system compares with other power saving schemes? Additionally it would be interesting to provide a performance evaluation of PBBF applied to a gossip protocol instead of a standard Broadcasting. From: Justin King [kingjkk@gmail.com] on behalf of Justin King [king1@uiuc.edu] Sent: Tuesday, February 12, 2008 12:46 AM To: Gupta, Indranil Subject: 525 review 02/12 Epidemic Algorithms for Replicated Database Management Demers, et al., 1987 This paper presents three strategies for spreading updates in a distributed database. The first two strategies, "Direct mail" and "Anti-entropy", are shown to have reliability and overhead issues in real systems. However, the third strategy, "Rumor mongering", is essentially an epidemic/probabalistic algorithm with paramaters tuned such that we achieve broadcast of database transactions with a high degree of confidence. In rare cases, broadcast may not be achieved (in the terms of the authors, there may be a positive number of both susceptible and removed copies of the database, but no infective copies). "Anti-entropy" can be used in limited quantities to further diminish this case. Upon these strategies, a variety of epidemic algorithms can be implemented. Positives: - Paper presents significant mathematical work on their algorithms, with fully worked-out examples. - Algorithms are implemented in real, production systems. Negatives: - Perhaps hindsight is 20/20, but I felt that the paper spent too much time on Deletion/Death Certificate strategies, which seem fairly obvious to me. This, though, may be the advantage of 20 years more research on Distributed systems upon which I can pull. - I believe the introduction section is too long: 2 pages for 11.5 total pages of text. My preference is for very concise, 1-page introductions. I would move Section 0.3 to follow 0, and make Sections 0.1 and 0.2 their own sections (with a more concise motivation in Section 0). ---------- Exploring the Energy-Latency Trade-off for Broadcasts in Energy-Saving Sensor Networks Miller, Sengul, and Gupta, 2005 This paper demonstrates a new broadcast protocol for energy-sensitive sensor networks called PBBF. 2 simple parameters are used to tune the performance of PBBF to the desired levels, to achieve correct and efficient broadcast while conserving the most energy possible. Simulation was done demonstrating these tradeoffs, and thorough results are presented in the paper. Positives: - Intersection of ideas: PBBF protocol has epidemic algorithm roots, but is in the sensor network area. - Very well-written introduction. I knew exactly what to expect from just reading the first 1.25 pages, including motivation, related work, key contributions, and an overview of the simulation results. Negatives: - I found several of the graphs difficult to understand at first glance. I had to find their corresponding text in the bowels of the paper in order to read some of them; I prefer to read the graphs first, and then pick out small details in the text. From: Chandrasekar Ramachandran [cramach2@uiuc.edu] Sent: Tuesday, February 12, 2008 12:43 AM To: indy@cs.uiuc.edu Subject: 525 review 02/12 A)Epidemic algorithms for replicated database maintenance, A. Demers et al, PODC 1987. Summary: This is a seminal work, which proposes several randomized algorithms that provide methods to spread and replicate updates among all distributed sites. The authors have proposed 3 epidemic algorithms that are simple and easy to implement. An important and recurring feature is the choice of non-uniformity in choice of partner sites. Additionally by randomizing several aspects of the algorithms, best estimates of completion of updates rather than exact values are provided. Pros: 1.A smooth flow of information from one section to another, and accompanied by sufficient background explanation. 2.The example of CIN topology’s spatial distribution with Anti-Entropy is particularly helpful to understand the role of this algorithm. 3.Using the activation threshold to avoid incorrect cancellations by the reactivated death certificate is a great idea,and its interplay with the normal threshold is interesting. Cons/Suggestions: 1.Simulations for spatial distribution using rumors not particularly conclusive. 2.The role of connection limits in affecting push-pull, push and pull could be explained further. B) Exploring the energy-latency trade-off for broadcasts in energy-saving sensor networks, M. Miller, C. Sengul, I. Gupta, ICDCS 2005 Summary: This paper proposes a method to understand how energy-consumed at nodes, the latency involved and the associated reliability are affected by broadcasts. The technique proposed, PBBF, is examined at a reliability boundary. Overall, this paper shows an excellent design of a non-deterministic and tunable algorithm that can be integrated into any application. The application proposed is for code distribution in Wireless Sensor Networks. Pros: 1.The experimental results show the efficient performance of PBBF especially when the q parameter is large, and especially well for nodes away from source nodes. 2.The support for application developers by helping in setting up of the q and p parameters. 3.The visual representation for demarcation of the reliability level. Cons/suggestions: 1.The average latency could be experimented for node failures. From: emenese2@uiuc.edu Sent: Monday, February 11, 2008 10:46 PM To: Gupta, Indranil Subject: 525 review 02/12 Paper: Bimodal Multicast Reviewer: Esteban Meneses The word “reliability” has different meanings according to two families of communication systems that provide support for reliable multicast communication. The first family of protocols is related to the strong reliability properties, that commonly include atomicity (if a multicast is delivered to any non-faulty node, it will eventually be delivered in all non-faulty nodes). It has been shown that these protocols hardly scale beyond several hundred participant, even with a very stable network. The second family includes protocols focus on best-effort reliability over a very large scale network. These protocols scale well even when message loss and failures occur. However, they don't provide an end-to-end reliability guarantee. In fact, this aspect is not very well defined concept for such protocols. Then, the programmer of a system must decide between protocols that don't scale but provide a clear sense of reliability and protocols that scale well but have a loose definition for reliability. Bimodal multicast is something in between those two families. The sense of reliability is guided by the “almost none or almost all” principle. As a bimodal probability distribution, this protocol can be configured to have a very small probability of delivering a multicast to a few nodes, a negligible probability of delivering to several but not most nodes and a high probability of delivering to all nodes. A pbcast (name of bimodal multicast) must have the following properties: atomicity (almost all or almost nothing), throughput stability, FIFO ordering per-sender in message delivery, multicast stability, detection of lost messages and scalability. The way it works is by having two phases. The first consists in a unreliable hierarchical broadcast that makes a best-effort attempt to deliver the multicast. However, the second phase is an anti-entropy (gossip) protocol that permit sites to catch up with the latest messages sent. This protocol has the ability to be configurable according to the conditions of the real system and also it has shown stability in the throughput. There is however a trade-off in the protocol. It is known that a reliable protocol with a lagging process can impact throughput and latencies in the whole group. The pbcast doesn't suffer this problem, but a complementary one. If a process lags far behind the rest of processes in the group, these processes may garbage collect the messages in the history and leave the slow process in a partitioned section of the group. The paper presents an interesting idea of having an equilibrium between best-effort and strong properties multicast protocols. This intermediate family permits to offer a probability guarantee in the atomicity of the multicast. I wonder what other areas are susceptible to the same idea: failure-detectors, replica management and so on. Also, the same effect happens with logical clocks. Vector clocks define a complete notion of causality in a DS, but they are not scalable. Lamport clocks on the other side are scalable, but not “complete”. The plausible clocks approach is something in between that offers a probability guarantee in the causality of events. I liked a lot that the paper offers several experiments for supporting the arguments. Although the paper is quite long, it is easy to read and understand, specially for the initial motivation of the General. The authors argument that a slow process that doesn't catch up with the rest of the system (an that will eventually be partitioned by the pbcast) is not a big problem, in the sense that the slow process in not a healthy one. I don't agree with that, because in the Internet there are a lot of transient conditions where connections get slow for a period of time and suddenly they recover. I didn't see any real data about failure rates in the experimentation results. The results are too conservative, but the Internet is not conservative at all (in that sense). Also, there is no results about the extra overhead in messages for having the second-phase in the pbcast. Perhaps this number is not negligible. Paper: Epidemic Algorithms for Replicated Database Maintenance Reviewer: Esteban Meneses There are several algorithms for keeping a replicated database consistent. However, most of those algorithms suffer from high bandwidth consumption or restrictive requirements for executing every operation on the database. The main idea of the paper is to offer a randomized algorithm for distributing updates in the database and making the replicas be consistent. The difference between the proposed technique and the previous approaches strives in two factors. First, the previous mechanisms depend on various guarantees for underlying communication protocols and on keeping control structures in a consistent way. The epidemic algorithm doesn't require such data structures. Second, the epidemic algorithm is randomized, which means that in several points of the algorithm a node is meant to take a random choice, distinguishing from the pure deterministic approaches. The epidemic algorithm follows several stages for keeping the replicated database consistent with a new update. A database is susceptible (it doesn't have the update), infective (it has the update and it is sharing it with other replicas) or removed (although it has the update, it not sharing it). Now, one stage of the algorithm is to spread an update via an infection-style protocol. This means that when an update is received by a node, it will randomly pick several other replicas to propagate the update. However, this mechanism alone cannot guarantee that all the replicas received the update. Another stage, called anti-entropy will try to solve this problem by randomly locating the lagged nodes. This phase can run in the background as the other stages are running. The strategy consists in that every one randomly picks another node and resolve the differences. Using this back-up mechanism it is very unlikely that an update hasn't reached the whole set of replicas. Another important stage consists in having death certificates, that appear when objects are deleted from the database. Then, although updates about deleted object X can be around the system, when they face a death certificate, they will stop being spread. The big advantage of this approach is its simplicity, that doesn't required complicated control mechanisms or the updating of several data structures in order to keep the consistency of a replicated database. Also, the speed with such an update is spread is also an advantage. The main drawback is that there is no 100% percent of certainty that an update will reach all the replicas. Also, there is an overhead for the anti-entropy mechanism that is running in the background, trying to discover which nodes don't have still an update. This paper introduces the application of epidemic theory (with a mathematical background) to the control of replicas in a database, by means of a gossip-based mechanism. In general rumors are an efficient mechanism for sharing information. However, we have seen that there is a risk of the rumors in not reaching everybody. If the rumor is about you, be sure you will be the last hearing it (if you hear it). This same idea can be applied to every problem that involves the sharing of information, from agreement to group communication. In networks, it seems to be interesting how problems can be tackled with this approach: routing (already done), streaming control, etc. I wonder if there is a kind of timed-epidemic algorithm. In this case we have a real-time limit DELTA to reach the whole system. I don't understand why the authors still rely on physical clocks. Instead of them (which we all know are difficult and expensive to synchronize) I would use logical clocks. In the epidemics algorithm there is a parameter k that can be modified according to how sure we want to be that everybody hears the rumor. However, there was no experimental study between the value of k and the bandwidth consumption. The paper didn't make clear how distances area computed. Is there any possibility for dynamic distances in the algorithm? What is the effect of incorporating spatial distributions compared to not considering them at all? Finally, what happens with process failures? I think that should be a main concern in consistency management for databases. From: marefin2@uiuc.edu Sent: Monday, February 11, 2008 8:26 PM To: Gupta, Indranil Subject: 525 review 02/12 Review 1: EPIDEMIC ALGORITHM FOR REPLICATED DATABASE MAINTENANCE Alan Demers, Dan Greene, Carl Hauser, Wes Irish, John Larson The paper is mainly about maintaining the consistency of the replicated database in case of update and delete. The paper mentions three different techniques for maintaining consistency, Direct Mail, Anti-Entropy and Rumor Mongering. Direct mail is just sending an email to each site about the update of the database. But there is no reliability that each site should receive the update, some mails can be discarded due to the buffer overflow. In the anti-entropy technique, each site randomly picks another site and exchanges information. This comes into three flavors, push, pull or push-pull. An analysis shows that pull or push-pull is preferable to push in case of Anti-Entropy. Anti-entropy is quite expensive in terms of latency but a reliable technique. Another solution is that each infected site picks up another site randomly and makes it infected with the update if the later site is susceptible. This is a good approach according to the mathematical analysis and simulation results. Rumor Mongering uses this approach. Here the latency is low, but it cannot assure that all sites are updated. So, another idea is to use anti-entropy as a back for rumor mongering technique. The paper also addresses how to delete some value from the database in a consistent manner. One solution to use death certificate and keep the certificate for a certain period. Another solution can be the use of dormant death certificate. In this manner, most of the death certificates are removed but some are dormant with timestamp. When some deleted values come across a dormant certificate, the dormant certificate becomes active and again starts removing the old value. This can be costly to send update to a distance site instead of sending to a nearby site. But the paper shows that spatial distribution of the nodes can improve the network traffic for both anti-entropy and rumor mongering process. The paper shows the classical approaches in the area of epidemic. · The paper describes three very simple algorithms that require few guarantees from the underlying network. · They authors have also considered the time and network traffic for propagating the update. The simulation shows that in rumor mongering, residue (number of nodes that miss the update) reduces exponentially with the increase of k, which is a parameter that decides how each site loses interest to spread the update. Also the spatial distribution can reduce the network traffic significantly. · Still there are some network topologies and parameter conditions that rumor mongering process couldn’t give the proper solution. · Also for death certificate, they have not specified how to measure the duration of keeping death certificate in the system instead of keeping it in some arbitrary period (or 30 days as in the paper). This can be a bottleneck. None of the methods described here can be used alone efficiently for replicated database. So, the final output tells to use the rumor mongering with the anti-entropy as the back. Isn’t it possible to design a single approach to support the efficient replicated system? Some of the future works of this project might have involved the finding of an algorithm that works for all pathological topologies. Also I think different current algorithms to manage different transactions in replicated systems are based on these approaches. Review 2: EXPLORING THE ENERGY-LATENCY TRADE-OFF FOR BROADCASTS IN ENERGY-SAVING SENSOR NETWORKS Matthew J. Miller, Cigdem Sengul, Indranil Gupta This paper is mainly on the energy-latency trade-off for broadcast in energy-saving sensor networks. The authors have proposed a new energy-saving technique for broadcasting in sensor network. The approach is based on the probabilistic protocol in broadcast. In the current IEEE 802.11 PSM protocol, each node broadcasts at the beginning of ATIM window and wait for next wakeup for rebroadcast. This increases the latency of broadcasting. PBBF introduces two probabilistic variable p, the probability that a node rebroadcast a packet in the current active time despite the fact that not all neighbors may be awake to receive the broadcast and another variable is q, the probability that a node remains on after the active time when it normally would sleep. In this PBBF approach, a node, after sending one message, can send another message without waiting for next ATIM window depending on the p parameter. Also another node can stay awake depending on the q parameter even if its active time is over. The application designer can use PBBF threshold protocol to set the values of p and q so that they are just across the reliability threshold region and can tune these values according to desired energy-latency trade-off. The analytical study finally shows that [in equation 8 and 9], the energy consumed at a node increases linearly with q but latency is inversely related to q and p. Thus, energy and latency are inversely related to each other in PBBF. Finally the simulation result of ns-2 shows the output that supports the previous analytical results. There are several mentionable points on this approach. · Sensor nodes are getting small day by day and the smallest sensor today is about the size of a gold atom. It is still the subject of research that how to allow more power in these types of small sensor motes. So the contribution of the authors about the energy saving broadcast is a good and necessary research outcome. · The classical approach here is to relate the latency and energy in broadcasting with this p and q parameters. They have used strong analytical and simulation result to support their PBBF techniques. · But the paper doesn’t include comparison of PBBF to other adaptive sleep protocols. There are some other protocols in sensor network. How the PBBF method outperforms the other available techniques, could be good information. · Also, figure 9 in the paper shows that with the increase of value q, the flooding hop count (for average 20-hop) first increases and then decreases for PBBF-0.5 and PBBF-0.75. There is not enough explanation in the paper about this behavior. The future works are quite obvious. Also the authors have pointed out some at the end of the paper. This includes how to adjust p and q parameters dynamically to improve PBBF performance and how to make it more efficient compared to other adaptive sleep protocol. From: Mirko Montanari [mirko.montanari@gmail.com] on behalf of Mirko Montanari [mmontan2@uiuc.edu] Sent: Monday, February 11, 2008 8:24 PM To: Gupta, Indranil Subject: 525 review 02/12 ===---===---===---===---===---===---=== Review of “Epidemic Algorithm for Replicated Database Maintenance” by Mirko Montanari This work describes a solution for efficiently replicate databases in a distributed system. The authors introduce a new algorithm, called rumor mongering, that, when used in conjunction with a previously known algorithm, called anti-entropy, can manage the updates of replicas in a distributed database system. In the proposed approach, the first algorithm (rumor mongering) is simple algorithm that disseminate single updates that happen in one of the database sites. The second algorithm (anti-entropy) is an expensive algorithm that is used to completely synchronize two database replicas. Rumor mongering is based on the epidemic principle of rumor spreading: when a site receives an update (rumor), it starts spreading it to other randomly selected sites. The paper analyses different strategies for the choice of which sites should be targeted for dissemination with an higher probability and when a site should stop spreading the update. The rumor mongering algorithm does not guarantee that all sites will receive an update (the rumor spreading could stop before everyone heard of it). The anti-entropy algorithm is used in background to complete check the consistency between two randomly-selected site: one of the two sites reads the content of the other database and updates its sets of records appropriately. This process, run once in a while, guarantees that every update will be eventually received by everyone. The paper also proposes a new way to handle deleted records. For every deleted record a “death certificate” is created to the record and disseminated. To avoid that a record could be resurrected by an out-of- date copy that did not receive the death certificate, dormant certificates are left on specific nodes even after the end of the update dissemination. These dormant certificates are activated and will become updates again if the delete records reappears on the database. The paper presents a good theoretical analysis of the algorithms by using the theory of epidemics to determine asymptotic bounds on the performances, both for the rumor mongering and the anti-entropy. The authors compared different ways to choose when each node should stop disseminating the message and analyzed different spatial distribution that can be used to disseminate messages. Spatial distribution that gives more probability of selection to closer sites can be preferred over a uniform spatial distribution because it significantly reduce traffic without big increase in the convergence time. A possible cons of the rumor mongering algorithm is the fact that the anti-entropy algorithm has to be used in order to guarantee that the updates will be received by all the nodes of the network. A future work could try to define an epidemic algorithm that does not depend on two separated processes for the dissemination and that is able to give guarantees of delivery through the use of random pulls from sites that have not received updates in a while. ===---===---===---===---===---===---=== Review of “Exploring the Energy-Latency Trade-off for Broadcasts in Energy-Saving Sensor Networks” This wok describes a broadcast protocol that can be integrated into an existing MAC level sleep scheduling. Two parameters of this protocol, p and q, determine the energy that will be used to disseminate a message and the latency with which the broadcast will be received by all the nodes. These two characteristics of the protocol depends on the particular application, so the choice of the parameter is left to the application developer. The proposed protocol works in following way: when a node receives a broadcast, the node can decide to rebroadcast it immediately with probability p without assuring that the neighbor are actually listening to messages. Every node, with probability q, can decide to not go into sleep immediately after its activity phase, but decide to stay awake and wait for broadcasts that are immediately retransmitted by its neighbors. This work reports an analytical probabilistic analysis of the protocol that explicits the relation between q and p with energy consumption and dissemination latency. The analytical results are backed up by simulations that confirms the expected results. The probabilistic analysis is able to give analytical results that could be used at application design time to determine the value of the parameters p and q. This work is specifically focused on broadcasts in sensor networks where there is full synchronization between the wake-up times of nodes. Even if this is probably the most common case, sometimes wake- up time for sensor networks might not be fully synchronized in this way. A sensor network could synchronize its wake-up time in order to guarantee a good sampling rate of a specific phenomenon. Even if the algorithm is applicable in this cases, the probabilistic analysis of the tradeoff might not be applicable. From: Anthony Cozzie [acozzie@gmail.com] Sent: Monday, February 11, 2008 4:10 PM To: Gupta, Indranil Subject: 525 review 02/12 Review: Demers & Epidemics The original gossip paper deals with multicast in the arena of replicated databases, where each database desires to propagate its updates to the others. Contains all the original insights of how epidemics can be simple and reliable, with reasonable overhead, and how to analyze them with differential equations. Covers push/pull, various counter schemes, epidemics of "anti information" aka death certificates, spatial gossipping where the topology of the network is used to improve efficiency, and so on. Notes: What can I say? It's a seminal work, and I read a ton of gossip based distributed systems papers last year. Review: PBBF One of the biggest challenges of sensor networks is conserving battery power. It turns out that the radio is often a bigger power hog than the CPU/memory, and there are many schemes to reduce power usage in sensor networks. This particular paper deals with sleep mode: a radio is usually in one of three states: transmitting (hi power), receiving (medium power), and sleep (very low power). The goal is to have the radio in sleep mode as much as possible while still providing QoS. 802.11 solves this problem by synchronizing the sleep cycles of its nodes. Nodes that wish to transmit must send ATIM packets to let other nodes know that packets are incoming. PBBF is based on two insights. In a gossip-type broadcast, a node must wait for the next ATIM window to forward a packet (neighbor set of 1 != neighbor set of 2), thus increasing latency, and that if all nodes forward the packet there will often be duplicate transmissions due to the implicit multicast of wireless communication. PBBF introduces two parameters, p the probability that a node immediately rebroadcasts without an ATIM, and q the probability that a node remains on after the ATIM. p trades off latency and reliability, while q trades off energy and reliability. Results come from simulations in ns2. The key figure is #12, which allows a protocol designer to trade off between 0.5 joules of energy (but 11 hops) and 3 joules of energy (but 2 hops). Notes: I would have liked to have seen a little more on node density. Obviously this has a huge impact, but it is relegated to single paragraph at the end. In theory figure 12 should become a 3-d contour plot. Gossip protocols seem extremely well suited to wireless networks due to the implicit multicast of a wireless transmission: the overhead of determining whether or not a message was actually received and by who is relatively larger in comparison to the transmission than in a wired network, and this overhead is eliminated by a "push" gossip protocol. From: Hengzhi Zhong [hzhong@uiuc.edu] Sent: Monday, February 11, 2008 1:03 PM To: Gupta, Indranil Subject: 525 review 02/12 Epidemic Algorithms for Replicated Database Maintenance Summary: Traditionally, the algorithms for consistency are complex and deterministic. This paper presents several simple randomized algorithms for maintaining consistency for replicated databases. The algorithms are Direct Mail, Anti-entropy, and Rumor Mongering. Direct Mail is moderately efficient, but can be unreliable. Anti-entropy is inefficient but reliable. Rumor Mongering is a complex epidemic algorithm with failure probability that can be made small. It is more efficient than Anti-entropy. One strategy is to first use Rumor Mongering, then back it up with Anti-entropy. The paper talks about using dormant death certificate and two types of timestamps to delete old items from databases. When a dormant death certificate sees an old data at some site, the death certificate is reactivated and propagates to all sites. The paper uses a spatial distribution to reduce network traffic and “hot spots”. Pros: 1. simple randomized algorithms can be used to maintain consistency 2. use spatial distribution to reduce network traffic and "hot spots" 3. the algorithms in the paper do not depend on server-specific data structure 4. discussed when to use pull, push, and push-pull 5. uses dormant death certificates with epidemic algorithms to delete old item from the database 6. reformulate anti-entropy as an epidemic algorithm, which allows people to think about other epidemic algorithms possible for maintaining consistency Cons: 1. randomized algorithms are probabilistic, so we do not know exactly when information is distributed to all sites 2. how to pick tau1 and tau2 for dormant death certificates? this is not clear from the paper. 3. space required for dormant death certificates is O(log n) at each server for large networks --------------------- Exploring the Energy-Latency Trade-off for Broadcasts in Energy-Saving Sensor Networks Summary: In this paper, a protocol PBBF is proposed to explore various trade-offs for energy, latency, and reliability. It is a probability-based protocol. PBBF works at the MAC layer and can be integrated into any sleep scheduling protocol. This protocol can help people to design a wireless sensor network WSN application that achieves a specific level of energy, latency, and reliability. To do so, PBBF introduces p, the probability of a node rebroadcasts in the current active time, and q, the probability that a node remains on when it should sleep. Pros: 1. an analysis on the trade-offs 2. PBBF allows people to tune the performance of their WSN application by changing the values of p and q 3. PBBF can be used with any sleep scheduling protocol 4. PBBF uses the bond percolation theory to analyze reliability and shows that there is a direct relationship between p and q for a given level of reliability. Cons: 1. the networks considered are grid-based, where each node is connected to four neighbors. Does this protocol still work for non-grid topology? 2. in simulations, the number of nodes N is only 50. Is N too small? What happens when N is large? Is PBBF scalable?