FREE ELECTRONIC LIBRARY - Abstract, dissertation, book

Pages:     | 1 |   ...   | 21 | 22 || 24 | 25 |

«Approximate Information Filtering in Structured Peer-to-Peer Networks Christian Zimmer Max-Planck Institute for Informatics Saarbrücken ...»

-- [ Page 23 ] --

• Address-space load balancing tries to partition the address-space of the DHT evenly among keys. This problem can be seen as solved by relying on consistent hashing (i.e., main property of DHT overlay networks) and constructions such as virtual peers. With virtual peers, each real peer pretends to be several distinct peers, each participating independently in the DHT protocol. The load of a peer is thus determined by summing the load of all its virtual peers. The Chord DHT [SMK+ 01] is based on consistent hashing and needs O(log n) copies to be operated for every peer [RLS+ 03]. Drawbacks of virtual peers are increased storage requirement and higher network bandwidth demand. In [KR06], the authors solve the drawbacks by activating only one of the virtual peers at any given time. The work presented in [BKadH05] considers the task of keeping a system balanced in presence of peers leaving and joining the network. The goals is to keep the lengths of intervals assigned to peers differ at most by a constant factor. The proposed scheme works in a constant number of rounds and achieves an optimal balance with high probability.

• Item load balancing aims to directly balance the distribution of items among peers in the network. Here, an arbitrary load distribution of items is assumed, i.e., some keys are more popular or appear more frequent than others. [ADH05] and [KR06] are prominent work addressing this issue. A simple and cost-effective approach for item load balancing is presented in [BCM03] applying the power of two choices paradigm.

There, an item is stored at the less loaded of two or more random alternatives.

[RPW04] presents a simple approach for balancing stored data items between peers in a fashion analogous to the dissipation of heat energy in materials. This algorithm improves the distribution of load in DHTs without requiring major changes of the DHTs themselves. In addition, the fault tolerance of P2P systems is increased by the proposed algorithm. Another approach for load balancing is described in [SVF08] and considers differences in peer capacities. In addition, load balance in heavily loaded P2P systems is investigated.

- 141 Chapter 8 Conclusion and Open Questions

Regarding the MAPS approach presented in this thesis, load balancing affects addressspace and item load balancing at the level of the distributed metadata directory. Beyond that, the distribution of active subscriptions is also important since highly monitored publishers may become a bottleneck. Overall, the following load balancing questions arise and

have to be addressed in future work:

–  –  –

• Load balancing is also important regarding active subscriptions. Publisher peers that store many subscriptions for requested continuous queries have to send a high number of notifications for new published documents that match these subscriptions. Thus, load balancing depends also on the number of publications since each publisher peer can adjust the number of publications to its capacities. Decreasing the publication rate also decreases the load for sending notifications because publishers with a low publication rate are less monitored.

Load balancing in P2P information management systems, including the presented MAPS approach, is a very important research topic for future work. Although existing balancing techniques can be applied, there still exist some additional issues that need more sophisticated balancing strategies.

–  –  –

8.3.2 Dynamics Another open problem for future work involves the effect of dynamics in P2P networks.

Within a P2P network, the churn rate 2 refers to the number of peers leaving the system during a given time period (e.g., an hour, or a year). Churn is a significant problem for large-scale systems, as they must maintain consistent information about peers in the system in order to operate effectively. For example, with a large churn rate it may be impossible to index file locations, making some files inaccessible even though the peers that store these files are online. In addition to peers leaving the network, dynamics in such a setting also includes peers joining or even changing content in an abrupt way.

[KEAAH05] presents an analytical study of churn using a master-equation-based approach using the Chord DHT. For any churn and stabilization rate, and any system size, the authors predict the fraction of failed or incorrect successor and finger pointers (see Section 2.1.3). Earlier work [RGRK04, CCR04, LNBK02] considered theoretical studies of asymptotic performance bounds of DHTs under churn or simulation-based studies.

Discussing the influence of dynamics for the MAPS architecture has to consider several aspects including the directory maintenance, the subscriber and publisher behavior. Since publishing documents is a main part of the information filtering setting, data dynamics is

not a concern in MAPS. Thus dynamics in MAPS only covers leaving and joining peers:

• The impact of publishers that leave or join the MAPS network is small. Whenever a publisher joins the network, it has to update the metadata stored in the distributed directory. If it is not the first time that this publisher joins (e.g., the digital library had some downtime), the publishing behavior is recognizable. Subscriptions from previous participation are still valid for future publications. Publishers leaving the MAPS network can not publish documents any more until the next join. A subscriber is not able to send a query to a publisher that has left the network. So, the number of monitored publishers needs to be increased (depending on the expected or estimated number of inactive publishers) to ensure a predefined number of monitored publishers.

• Similarly to the previous discussion, the impact of leaving or joining subscribers is minor. Whenever a subscriber leaves the network, it can not request new continuous queries, although Its active subscriptions will stay valid and issue notifications for matching publications. Of course, the subscriber will not receive the notification, so protocols that support storing notifications in the directory seem to be a necessary system component (see Section 7.3). Then, a subscriber rejoining the MAPS network receives the stored notifications for matching publications during its downtime.

• The effect of peer churn on directory maintenance is a challenging problem that might affect both the effectiveness and efficiency of the system. If a new directory peer joins the MAPS system, it is responsible for a certain subset of the key space. Thus, there are two possibilities to handle joins: (i) the directory peer waits for updated metadata from publishers, but until then, the directory cannot provide metadata for certain keys; (ii) the joining protocol includes the dissemination of existing metadata by exploiting the consistent hashing property. In this context, a critical issue that arises is the departure of directory peers from the network without prior notification or without following the network disconnection protocol. If so, the metadata for keys that the directory peer was responsible for are no longer available. In this case, publisher peers have to update their statistics to ensure a proper directory maintenance.

2 The phrase is based on the English idiom to churn up, meaning to agitate or produce violent motion

–  –  –

A possible solution for the problem of churn on directory maintenance is replication:

more than one directory peers may store metadata for a key to ensure the availability of directory statistics. This issue will be discussed in the next section.

8.3.3 Replication & Caching An interesting open question for future work involves dealing with replication & caching.

Replication is the process of sharing information so as to ensure consistency between redundant resources, such as software or hardware components, to improve reliability, faulttolerance, or accessibility. It could be data replication if the same data is stored on multiple storage devices, or computation replication if the same computing task is executed many times. Typically, the access to a replicated entity is uniform with access to a single, nonreplicated entity. The replication itself should be transparent to an external user. Also, in a failure scenario, a failover of replicas is hidden as much as possible.

In computer science, a cache is a collection of data duplicating original values stored elsewhere or computed earlier, where the original data is expensive to fetch (owing to longer access time) or to compute, compared to the cost of reading the cache. In other words, a cache is a temporary storage area where frequently accessed data can be stored for rapid access. Once the data is stored in the cache, future use can be made by accessing the cached copy rather than re-fetching or recomputing the original data, so that the average access time is shorter. Cache, therefore, helps expedite data access that the CPU would otherwise need to fetch from main memory.

Both caching and replication are also two popular topics in distributed information management. [RL05] compares two popular redundancy schemes: replication and erasure encoding. Replication is also considered in both DHT protocols used in MAPS (Chord [RFH+ 01] and Pastry [RD01b]) whereas erasure encoding is investigated in [DLS+ 04, KBC+ 00]. In general, replication denotes the simple redundancy scheme where k identical copies of each data object are kept in the system. The erasure encoding redundancy scheme means that each data object divided into m fragments is recorded into n fragments stored separately with n m. The key property of erasure encoding is that the original data object can be reconstructed from any m fragments. In [ZBW08], a cache-aware, multi-round query routing strategy is developed that balances between query efficiency and result quality in distributed P2P information retrieval (e.g., for the Minerva architecture). In addition to this Exact Caching (EC) strategy, the aggressive reuse of the cached results of even subsets of a query is proposed as the Approximate Caching (AC) technique. This strategy can drastically reduce the bandwidth overheads. Older work proposes a Uniform Index Caching (UIC) approach where query results are cached in all peers along the inverse query path such that the same query requested by other peers can be replied from their nearby cached results. The DiCAS (Distributed Caching and Adaptive Search) approach [WXLZ04] extends this by distributing the cached results among neighboring peers and queries are forwarded to peers with high probability of providing the desired cache results.

Both UIC and DiCAS protocols cache the final results of a query and reuse them only if a query matches exactly the cached results, i.e., only if the results were cached due to an identical query issued earlier. It should be noted that UIC and DiCAS are developed in the context of unstructured P2P networks and are not directly applicable in DHT-based structured networks. In [BCG+ 03], the authors present a way to efficiently maintain a distributed version of inverted list intersections that were computed frequently to answer queries. Here, the cacheable entity is a materialized intersection of inverted lists which reduces the bandwidth consumption due to frequent shipping of inverted lists. Similar ideas are also proposed recently in [SA06].

- 144 Chapter 8 Conclusion and Open Questions In the context of P2P information filtering, it is not obvious either how or what to cache or replicate. In the setting of the thesis, the most benefits out of replication and caching emerge from applying them to the distributed directory of statistics to ensure directory maintenance for high dynamics. In a combined architecture as MinervaDL, caching strategies as mentioned above should be applied. Other caching and replication strategies need more work in the future.

8.3.4 Economics Finally, an interesting direction for future research covers the economic aspects of the presented approximate information filtering approach. In contrast to existing exact filtering approaches, publishers have to disseminate their data items or documents such that the data is available in the system. In MAPS, publishers are not required to expose their actual content to the rest of the network, but rather they disseminate metadata about it. This new paradigm allows to introduce a payment model for approximate filtering which introduces a new research direction and an interesting business model.

The design of an incentive-based payment model has to consider several cost aspects and restrictions, e.g., the MAPS system has to avoid that subscribers increase the number of monitored publishers above a certain threshold. This can be achieved by introducing a cost for submitting a subscription to a publisher. In the following, the properties of a possible payment model are driven by the following assumptions: (i) subscribers are interested in harvesting information while minimizing their costs, and (ii) publishers are interested in selling information in order to maximize their profit.

An interesting dimension introduced by adding an economic perspective involves the limitation of subscribers to monitor only a few publishers. This limitation makes publisher selection a crucial issue, since subscribers may afford to monitor only a few sources. In the

following, interesting issues introduced by an economic model are elaborated:

• Directory peers have to maintain metadata of publishers and provide this information to subscribers that submit a continuous query. One possible payment model could assume that publishers have to pay for storing metadata and subscribers have to pay for receiving metadata. Thus, directory peers have an incentive to store as much metadata as possible, and they are only restricted by their storage capacity and computational power. This assumption ensures that peers not interested in taking part in the directory maintenance get paid.

Pages:     | 1 |   ...   | 21 | 22 || 24 | 25 |

Similar works:

«Addiction and Treatment SPOKANE COUNTY Alcohol and Drug Coordinator, Dan Finn Spokane County 312 West Eighth Avenue Spokane, WA 99204 (509) 477-4507 FAX: (509) 477-6827 E-Mail: dfinn@spokanecounty.org Prevention Specialist, Alan Zeuge (509) 477-4508 FAX: (509) 477-6827 E-Mail: azeuge@spokanecounty.org STEVENS COUNTY Alcohol and Drug Coordinator, David Nielsen 165 East Hawthorne Avenue Colville, WA 99114 (509) 685-0627 FAX: (509) 684-5286 E-Mail: dmnielsen@co.stevens.wa.us Prevention Specialist,...»

«INTERNATIONAL SCIENTIFIC SEMINAR “NOMADS OF CENTRAL ASIA AND THE SILK ROADS” PAPER ABSTRACT Means of survival means of development; handicrafts and tourism on the Silk Roads in Mongolia today. Tourism is a major process in modern society and culture. Museums and craft production are key elements in this process. After a review of tourism and the role of museums in Mongolia, and representations of Mongolia in Europe, it is argued that craft production should play an important part in the...»

«Make the Switch to Patterson State Bank! Switching from your current bank to Patterson State Bank has never been so easy! ! Our Switch Specialists are ready to assist you every step of the way. ! Seven Simple Switching Steps: ! 1. Open your new Patterson State Bank checking account at one of convenient branch locations OR online at www.pattersonstatebank.com ! 2. Enroll in Online Banking Services: ☐ Online Banking ☐ Electronic Statements ☐ Bill Pay ☐ Mobile Banking ! 3. Stop actively...»

«Marcus Bingenheimer Der Monchsgelehrte Yinshun (* 1906) und seine Bedeutung fUr den Chinesisch-Taiwanischen Buddhismus im 20. Jahrhundert Wurzburger Sinologische Schriften edition forum Für D.B., B.v.C, W.J. und all die anderen Inhaltsverzeichnis I. Hauptteil 1. Einleitung 1.1 Aufbau der Arbeit 13 1.2 Relevanz des Themas 16 1.3 Der besondere Beitrag Yinshuns zum Buddhismus 19 1.4 Stand der Forschung 21 2. Yinshuns Zeit auf dem Festland Die ersten fünfzig Jahre 2.1 Die frühen Jahre 2.1.1...»

«AMUSEMENT, DELIGHT, WHIMSY, AND WIT, THE PLACE OF HUMOUR IN HUMAN CREATIVITY Edith K. Ackermann Abstract People generally respond to humour, i.e., they are inclined to smile at something funny. People also crack jokes and, starting at age two, human infants engage in pretence or fantasy play. Research on creativity, for its part, has either ignored the trickster within, or it gives our fascination with the bizarre a mixed press. The same goes, to a lesser extent, of playfulness. Among...»


«Wie Mitglieder der lokalen Behörden gewählt werden 1. Struktur und Mitgliedschaft in Kommunalbehörden 2. Wählbarkeit 3. Wer darf bei einer Kommunalwahl seine Stimme abgeben? 4. Wählerregister 5. Regelungen für die Stimmabgabe 6. Wann werden Kommunalwahlen abgehalten? 7. Wie wird die Wahl organisiert? 8. Nominierung der Kandidaten 9. Die Abstimmung 10. Stimmabgabe 11. Auszählung 12. Ergebnisse der Wahl 13. Anruf eines Gerichts 14. Vorsitzender/Bürgermeister 15. Plötzliche Vakanzen 16....»

«REVIEW ARTICLE Monumenta Serica 42 (1994): 521-529 THE WHITE LOTUS TEACHINGS HUBERT SEIWERT Review of B[AREND] J. TER HAAR, The White Lotus Teachings in Chinese Religious History. Sinica Leidensia 26. Leiden/New York/Köln: E. J. Brill, 1992„ XIV, 343 pp. 7 maps. Gld. 170. US$ 97.25. ISSN 0169-9563. ISBN 90-04-09414-8 (cloth) During the last two decades the study of Chinese popular religion and sectarianism has becomc a major subject of Western scholars in Chinese religion. Whilc before the l...»

«Concepts for the Analysis of Very Small Samples and Fast Capillary Electrophoresis Coupled to Mass Spectrometry Dissertation zur Erlangung des Doktorgrades der Naturwissenschaften (Dr. rer. nat.) der Fakultät Chemie und Pharmazie der Universität Regensburg vorgelegt von Marco Grundmann aus Karl-Marx-Stadt (jetzt Chemnitz) im Jahr 2012 Diese Dissertation entstand in der Zeit von November 2008 bis Mai 2012 am Institut für analytische Chemie, Chemound Biosensorik der Fakultät Chemie und...»

«The role of abiotic processes in the formation and degradation of gaseous nitrogen compounds in the soil Jannis Heil Member of the Helmholtz Association Energie & Umwelt /  Energy & Environment Band/ Volume 297 ISBN 978-3-95806-106-4 Schriften des Forschungszentrums Jülich Reihe Energie & Umwelt / Energy & Environment Band / Volume 297 Forschungszentrum Jülich GmbH Institute of Bioand Geosciences Agrosphere (IBG-3) The role of abiotic processes in the formation and degradation...»

«Янко Слава || Библиотека Fort/Da || http://yanko.lib.ru || http://yanko.lib.ru/gum.html || slavaaa@yandex.ru || yanko_slava@yahoo.com ШРИФТ «АЛЬФА» необходим для древнегреческого, правда, для трех слов. ☺ Сканирование и форматирование: Янко Слава (Библиотека Fort/Da) || yanko.lib.ru@rambler.ru || yanko_slava@yahoo.com || http://yanko.lib.ru || Icq# 75088656 ||...»

«Testimonies For The Church Volume One Table of Contents Chapter Title Page # Preface The Background of Volume One Biographical Sketch 1. My Childhood 2. My Conversion 3. Feelings of Despair 4. Leaving the Methodist Church 5. Opposition of Formal Brethren 6. Advent Experience 7. My First Vision 8. Call to Travel 9. Vision of the New Earth 10. Withholding Reproof 11. Marriage and Subsequent Labours 12. Publishing and Travelling 13. Removal to Michigan 14. The Death of My Husband Testimony 1...»

<<  HOME   |    CONTACTS
2016 www.abstract.xlibx.info - Free e-library - Abstract, dissertation, book

Materials of this site are available for review, all rights belong to their respective owners.
If you do not agree with the fact that your material is placed on this site, please, email us, we will within 1-2 business days delete him.