From: Bach Duy Bui Sent: Thursday, February 09, 2006 10:04 AM To: Indranil Gupta Subject: 598ig review 02/09 Gossip-based Ad Hoc Routing 1. Summary In this paper, the authors through experiments analyze advantage of using gossip-based communication paradigms on ad hoc routing. The discussion is based on experiments to show that the optimal parameter, i.e. gossip probability, is 0.7. Heuristics to improve performance of gossiping are then proposed including: (1) a two threshold scheme: a node gossip with higher probability if it had lower number of neighbor; (2) preventing premature gossip death: a node who has not gossiped gossips if it has not received a certain amount of the same packet; (3) retries: resend packet using gossip to increase the probability of finding a path to a node; (4) zone: using gossip in the zone routing protocol; (5) gossiping AODV: using gossip instead of flooding in AODV expanding-ring search 2. Comments Although gossip mechanism has advantage in the large scale network, say 100-1000 nodes, its probabilistic characteristic prevents it to give a predictable performance in small scale network. Since we will observe more ad hoc network in small scale than large scale, it would be good if author gave more discussion of gossip's performance in small networks. The effective usage of gossip much depends on the choice of parameters, since the value of parameters that author mentions in the paper are experiment-based without any generalization, it is skeptical that the same values can be used in other situations. For that reason, it is hard to evaluate the value of those numbers. 3. Future works Network parameters: the authors should investigate on the effect of network parameters: i.e. network topology, node degree (network density), on the effectiveness of gossip. For example: gossip may die out easier in a low density network, thus using flooding may be a better choice in this network. Mathematic proof can help authors' idea more persuasive Epidemic Algorithms for Replicated Database Maintenance 1. Summary In this paper, the authors analyze the use of various epidemic algorithms for database update problem with especial focus on practical usage. Complex epidemic schemes are introduced where different epidemic techniques are used to improve the performance of the task. These techniques involve the use of various factors: blind or feedback-based spreading, counter or probabilistic of data interest level. Simulations show that the use of more deterministic factors i.e feedback and counter does improve the delay. Push and Pull techniques are other considerations. The authors show that the combination of pull with counter and feedback improve residue. A more complicated scheme is to use complex epidemic as a back up for anti-entropy that helps in significantly reducing update traffic. The authors also discuss other design factors: deletion and death certificates, spatial distribution showing that: (1) retaining dormant death certificates at a few sites significantly improve the network's immunity to obsolete data at reasonable cost (2) a well chosen spatial distribution epidemic performs significantly better than a uniform distribution scheme. 2. Comments and Future works In this paper, authors assumes a perfect network connectivity and do not consider network topology effect. It is obvious that data delivery may fail in practical systems, thus it is worth to consider how the failure affects the use of various epidemic techniques. Pathological network topology is also another factor that should be taken into account in design of epidemics. From: ercanucan@gmail.com on behalf of Ercan Ucan Sent: Thursday, February 09, 2006 8:12 AM To: Indranil Gupta Subject: "598ig review 02/09" Epidemic Algorithms for Replicated Database Maintenance ------------------------------------------------------------------------------ Problem Addressed: Maintaining the mutual consistency of a database is an important problem when the database is replicated at many sites. This paper presents several randomized algorithms for distributing updates and driving replicas toward consistency. These algorithms are closely analogous to epidemics, and the study of epidemiology helps us understanding the behavior of these approaches. Approach Taken & Comments: In this paper, several methods for distributing the updates were presented and analyzed. First one is direct mail; in this method each new update is immediately mailed from its entry site to all other sites. According to the results in the paper, this is timely and reasonably efficient, but not totally reliable since some individual sites do not know about all other sites all the time. Also, the probability of the messages getting lost is another impediment to the reliability of this scheme. Second scheme is anti-entropy; in this method every site chooses another site at random. By exchanging database contents differences between two databases are detected. This scheme is quite reliable but there is an important issue with the performance of it. In this scheme, updates are propagated much more slowly than direct mail. The third approach is called rumor mongering; this one is a little more involved as compared to the first two schemes. Initially, sites are 'ignorant' and as soon as they get an update the update becomes a 'hot rumor' and this rumor is periodically sent to the randomly chosen sites to make sure that they have seen these updates before. When a site has tried to share its hot rumor with a lot of other sites that have already seen that update, it stops treating this update as a hot rumor. This approach is more efficient than anti-entropy performance-wise but still there is a reliability issue with this messages being propagated to the selected targets. In the first two approaches, choosing partners to exchange information creates high network traffic. The authors considered selecting the spatial distribution that supports the partners being topologically close to each other. Direct mail generates n messages for each update where n is the number of sites in the system. The traffic in this case is proportional to the number of sites times the average distance between sites. Analysis of anti-entropy is a little more involved than the direct mail analysis. Pull type converges to 0 much more quickly than push type does. So, when anti-entropy is used as a backup of some other mechanism, pull or push-pull (since this is also like pull) should be used. Then the paper talks about some design criteria that lead to 'good' epidemic protocols. They are residue (remaining susceptibles when the epidemic finishes), traffic and delay. Then types of rumor spreading are discussed in the paper. Blind vs. Feedback, Counter vs. Coin, Push vs. Pull etc. After this, several approaches are presented such as backing up a complex epidemic with anti-entropy, deletion and death certificates, dormant death certificates, anti-entropy with dormant death certificates. After this, the topic that I found interesting came in the paper. The idea of spatial distribution. The analysis of spatial distribution and anti-entropy was presented and using spatial distribution of 1/Q_s(d)^n had quite encouraging results. Overall, I think that this paper is quite instructive and could be directly put into the textbooks since it has a neat and tidy structure as compared to many other papers in the area. Bimodal Multicast ------------------------ Problem Addressed: The goal of this paper is to develop a multicast protocol that is reliable such that it can be "rigorously quantified" and includes throughput stability guarantees. This protocol is called bimodal multicast because of its reliability model. Authors claim that bimodal multicast is reliable, scalable and it provides stable delivery throughput. The protocol that the authors suggest is called pbcast (probabilistic broadcast) Approach & Comments: Best effort reliability is defined as follows: if a participating process discovers a failure, a reasonable effort is made to overcome it. This protocol is somewhat related to the paper that I mentioned above by Alan Demers. It can be seen as offering a form of wear real-time guarantee. The special thing about this protocol is that a communications graph is laid over on a set of processes and neighbors gossip to diffuse news postings in a reliable way. Here is the example from the paper: "If process A receives a news posting and then establishes communication with process B, A would offer B a copy of that news message and B solicits the copy if it has not already seen the message." There is a difference between the previous paper and this one: the frequency of database updates is low thus there was no throughput problem in the first paper. Pbcast consists of two sub-protocols. First one is an unreliable, hierarchical broadcast that makes a best-effort attempt to efficiently deliver each message to its destinations. Second one a two-phase anti-entropy protocol that was discussed in the first paper as well. Anti-entropy protocol operates in a series of unsynchronized rounds. In each round, first phase is used to detect the message loss and the second phase is run on demand if there a message losses which were detected by the first phase. First phase can be done using IP multicast, if IP multicast is available in the system. Otherwise, a randomized dissemination protocol would be used in the first phase. This idea randomized protocol is very similar to the protocol that was used in OceanStore, which is a persistent data storage system. It basically pseudo-randomly generates spanning tree for broadcasting messages to the entire group. It seems like the current tree dissemination (the first phase) of pbcast is not scalable to thousands of nodes. In the experiments they showed that it works fine with hundreds of nodes however, it probably not for thousands and millions networks. As a future work, making pbcast to scale to larger networks could be considered. An improvement that they have made to the regular gossip protocol is that pbcast emphasizes achieving a common suffix of message histories rather than a common prefix. This looks like a clever improvement to me in the sense of overall performance of the system. Protocol prioritizes the recovery of recent messages. Also, when a message becomes old enough the protocol gives up entirely and marks the message as lost. Next, the seven important optimizations that are applied to the design were discussed. Those are soft-failure detection, round retransmission limit, cyclic retransmissions, most-recent-first retransmission, independent numbering of rounds, and random graphs for scalability and multicast for some retransmissions. -- Ercan Ucan - eucan2@uiuc.edu Graduate Student Computer Science Department University of Illinois at Urbana-Champaign ------------------------------------------------------------ From: Hussam Hasan Abu-Libdeh Sent: Wednesday, February 08, 2006 11:02 PM To: Indranil Gupta Subject: 598ig review 02/09 The two papers that i read and would like to review are: 1. Gossip-Based Ad Hoc Routing. 2. Epidemic Algorithms for Replicated Database Maintenance. Gossip-Based Ad Hoc Routing: In a nutshell, this paper discusses the use of gossip (sending messages based on probabilities and random throws) to route messages in wireless ad hoc network instead of flooding. The aim of this paper was to provide studies and analysis that the use of gossip can actually lower the number of messages sent, thus decreasing the sent traffic while still routing the message successfully (with a highly probability). This paper used the fact that the physical layer of wireless networks is multicast by nature, thus we don't have to send the same message to all neighboring nodes separately (just group it in one message). This paper defines two main variables: The probability that a node will send a message (forward) is equal to 'p'. And the probability that a receiving node will forward that message is equal to '1-p'. The second variable that this paper uses is 'k', which is the number of hops after which this probabilistic method of gossiping kicks into effect. The use of k is just to ensure a higher probability that a message won't die out quickly. From this paper i learned about the bimodal effect observed in ad hoc wireless networks. This effect simply put is just the observation that in such systems with gossip messages, it is either the case that most of the nodes receive the gossip message, or most of the clients don't receive that gossip message (basically, the outcome could be one of two extremes). This bimodal effect is quite interesting from my point of view, i never thought it would be the case. Anyway, this bimodal effect turns out to be quite useful in deriving mathematical upper/lower bounds on the probability of success of routing over multiple executions of the algorithm. After that the paper goes into experimental results that they have simulated and the savings they got by using gossiping over flooding, and explain how to use gossiping with AODV. Personally, i thought the paper was good overall, i liked the author's way of explaining stuff (it was as if i was reading a good textbook). However, their mathematical proof was sort of diving into a dark cave, i felt it wasn't that clear. But, the paper was able to successfully get back up after that section :-). ----------------- Epidemic Algorithms for Replicated Database Maintenance: In summary, this paper talks about the use of nondeterministic randomized algorithms (instead of deterministic algorithms) in the context of replicating databases. It touches alot (and bases its arguments) on the subject of epidemics. The three main methods that are discussed are: 1. Direct mail: which is the most intuitive idea, just send your updates to everybody on your list immediately. 2. Anti-entropy (and its derivatives): uses gossip-style messages to inform random servers of updates. This method (and its sisters/derivatives) require more time than direct mail because of the probabilistic factor. 3. Rumor mongering: which closely emulates human behavior in spreading rumors. Basically a server picks a server that it knows about and informs it about its updates. if after a while it notices that alot of its contacts already know about that update it stops (because that rumor is not hot any more!). It is obvious that both Anti-entropy, and Rumor mongering don't guarantee 100% delivery of all updates. However, the chances of success are very high. I had a couple of personal thoughts when i was reading this paper: First of all, are we really willing to use probabilistic algorithms in such big mission critical applications such as replicating databases ? I know that the chances of success are great, but still, how much of a chance are we willing to take here ? The second question that came to mind was: I noticed that in many algorithms discussed thus far, we have been emulating the behavior of a human society to a great extent (Epidemics, Rumors, Gossiping, some of the p2p protocols ..etc) and i couldn't help but think: would it be an advantage or a disadvantage if we isolated ourselves from those frames of reference and looked at the problems without any background knowledge of "human distributed systems"? I don't claim to have any such frame of reference, but i think it is interesting to know whether mimicking human life is an advantage or disadvantage! --Hussam From: Long Hai Vu Sent: Wednesday, February 08, 2006 10:09 PM To: Indranil Gupta Cc: Long Vu Subject: 598ig review 02/09 CS598IG-Spring 2006 Name: Long Vu Date: 02/09/2006 Review papers 1. Gossip-Based Ad hoc routing In this paper, authors propose a gossip-based approach which probabilistically propagates routing messages based on a gossip probability. This reduces the routing overhead among nodes in a network. With the probability between 0.6 and 08, substantial number of nodes gets the message. They stated in the paper that in almost all executions of the ad hoc routing algorithm, either virtually no nodes receive the message or most of them do. Thus, the goal of this paper is how to find the fraction of executions where both the number of dead messages and gossip probability are low to reduce number of message overhead and ensure the message propagation process. Discussion In the basic gossiping protocol, a node broadcasts a message with probability p, if it receives the same message after the first time, it discards this message. So, how to set the buffer to remember “history” of message? In the “high traffic” network with arbitrary topology, this buffer needs to be managed efficiently. For GOSSIP2(p1,k,p2,n) protocol, how to set the parameters for a network whose arbitrary topology? They use the average degree of node in a network to set these parameters. But, it seems inefficient for the networks in which nodes have so different number of neighbors, i.e. some nodes have many neighbors, some have a few. Are there any ways to set there parameters “adaptively” with the topology of the network? Do not fix these parameters for all nodes in the network. The proposed GOSSIP protocol concentrates on sufficiently well connected networks, could we use GOSSIP protocol for mobility networks where nodes are mobile? In this case, we have dynamic topology and dynamic number of neighbors. How many retries we have to make? Future work suggestion: an adaptive GOSSIP protocol which can be aware of network’s changes. 2. Epidemic algorithm for replicated database maintenance This paper focuses on one of the most critical issue of distributed database: how to update data efficiently and consistently. In the paper, some randomized algorithms are proposed for use to update and replicate data in large scaled databases. There are two difficulties: propagation time to update the whole system with multiple sites and network traffic during the update. The goal of this paper is to solve these difficulties and spread data over the distributed system. Authors examine three methods including direct mail, anti-entropy, and rumor mongering in this paper. The approach is also implemented on Clearinghouse server of Xerox Corporate Internet. Discussion For complex epidemics, could we add some heuristics like distance between the sender and other sites in the same network to spread the data? The closer sites have higher priority. If we choose sites randomly, the far sites may be contacted before the near sites, this may generate traffic complexity. Could we use the idea of Kelips to divide system into affinity groups of sites and then design the protocol for spreading data inside an affinity group and across groups? From: Maifi Khan [maifi.khan@gmail.com] Sent: Wednesday, February 08, 2006 9:20 PM To: Indranil Gupta Subject: 598ig review 02/09 Title: Epidemic Algorithms for replicated database maintenance. Summary: This paper describes several randomized algorithms which are analogues to epidemics to distribute updates among databases replicated at many sites. They use techniques from epidemiology to explain the behavior of these algorithms. They divides the sites as infective which holds an update willing to share, susceptible that has not received the update yet, removed which has received the update but no longer wants to share. The database copy of a data item is a time varying partial function. For distributing updates, a pair with a larger timestamp will always supercede one with a smaller timestamp. Direct mail strategy tries to notify all other sites by sending mail. Anti-entropy in its basic form choose another site at random and exchange the whole database contents to resolves any difference between the two. One modification is every sites maintain a checksum of its database contents and sites first exchange checksums to decide if exchange of database is needed or not. Rumor mongering is a complex epidemic style where when a site receives a new update it becomes hot rumor and choose another site at random to send the update. If it chooses a site that has already seen the update for a number of times, it stops spreading the update. They also describe the push and pull variations of this technique. Anti-entropy can also be used to back up rumor mongering. To handle deletion of items, they introduced the concept of death certificates to prevent the reincarnation of old copies. Criticism: 1. As the algorithms described in this paper are randomized, they cannot guarantee about the convergence time of distributing updates. It makes such algorithms unsuitable for time critical application where we need guaranteed service time limit. 2. Direct mail is not guaranteed to work as messages may be deleted at mail server or exact set of sites may not be known to the initiator of updates. 3. Anti entropy is communication expensive. 4.For death certificate, there is tradeoff for space and possibility of reincarnation of deleted items. If a site is disconnected for longer time than the holding time of death certificate, old copies will be reincarnated. 5. Algorithms presented are Sensitive to network topology. Future work: 1. Based on different capacity among servers in terms of storage and performance, a hierarchical structure among servers may be optimal in distributing updates. 2. More robust and adaptive algorithms based on network topology is needed for better performance. Title: Gossip based Ad Hoc routing. Summary: This paper proposes a gossiping based technique for routing messages where each node forwards a message with some probability. Gossiping exhibits bimodal behavior. They describe multiple variation of gossiping algorithm. GOSSIP1(p) is the basic protocol where a node broadcasts the newly received message to its neighbors with probability p. To improve the performance of GOSSIP1, they described GOSSIP2(p1,k,p2,n) where p1 is same as p, k is the number of hops with which gossiping start with probability 1, the node with fewer than n neighbors gossip with probability p2>p1. GOSSIP3(p,k,m) is proposed to prevent premature deaths of gossip, nodes can monitor the number of copies it receives from its neighbors and if it is below some prespecified threshold m, it broadcasts the message to all its neighbors after timeout. GOSSIP4(p,k,k') uses the technique of ZRP(Zone routing protocol) where each node has a zone of radius k'. It is experimentally shown that GOSSIP technique uses less messages than flooding techniques. Criticism: 1. Due to bimodal behavior either most of the nodes get the message or hardly any of the nodes gets the message. This kind of behavior is not suitable for all type of applications. 2. Performance of algorithm is highly dependent on degree of network. For higher degree, the required gossip probability is higher and vice versa. 3. The algorithm suffers from boundary effect- the nodes close to boundary shows decrease in probability of receiving messages. Future work: 1. Further work is required to determine how to specify the gossip parameters for different types of network.