Aigaion: RACTI / RU1 Technical Report Series (Web Based)

[RACTI-RU1-2005-20] Chatzigiannakis, Ioannis and Nikoletseas, Sotiris, A Forward Planning Situated Protocol for Data Propagation in Wireless Sensor Networks based on Swarm Intelligence Techniques, in: 17th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2005), pages 214, ACM Press, ACM, Las Vegas, NV, USA, 2005.
Abstract: We here present the Forward Planning Situated Protocol (FPSP), for scalable, energy efficient and fault tolerant data propagation in situated wireless sensor networks. To deal with the increased complexity of such deeply networked sensor systems, instead of emphasizing a particular aspect of the services provided, i.e. either low-energy periodic, or low-latency event-driven, or high-success query-based sensing, FPSP uses two novel mechanisms that allow the network operator to adjust the performance of the protocol in terms of energy, latency and success rate on a per-task basis. We emphasize distributedness, direct or indirect interactions among relatively simple agents, flexibility and robustness. The protocol operates by employing a series of plan & forward phases through which devices self-organize into forwarding groups that propagate data over discovered paths. FPSP performs a limited number of long range, high power data transmissions to collect information regarding the neighboring devices. The acquired information makes it possible to plan a sequence (of length parameterized by λ) of short range, low power transmissions between nearby particles, based on certain optimization criteria. All particles that decide to respond (based on local criteria) to these long range transmissions enter the forwarding phase during which information is propagated via the acquired plan. Clearly, the duration of the forwarding phases is characterized by the parameter λ, the transmission medium and the processing speed of the devices. In fact the parameter λ provides a mechanism to adjust the protocol performance in terms of the latency-energy trade-off. By reducing λ the latency is reduced at the cost of spending extra energy, while by increasing λ, the energy dissipation is reduced but the latency is increased.
To control the success rate-energy trade-off, particles react locally to environment and context changes by using a set of rules that are based on response thresholds that relate individual-level plasticity with network-level resiliency, motivated by the nature-inspired method for dividing labor, a metaphor of social insect behavior for solving problems [1]. Each particle has an individual response threshold θ that is related to the "local" density (as observed by the particle, [2]); particles engage in propagation of events when the level of the task-associated stimuli exceeds their thresholds. Let s be the intensity of a stimulus associated with a particular sensing task, set by the human authorities. We adopt the response function T_θ(s) = sⁿ / (sⁿ + θⁿ), the probability of performing the task as a function of s, where n > 1 determines the steepness of the threshold. Thus, when θ is small (i.e. the network is sparse) the response probability increases; when s increases (i.e. for critical sensing tasks) the response probability increases as well. This role-based approach, where a selected number of devices do the high cost planning and the rest of the network operates in a low cost state, leads to systems that have increased energy efficiency and high fault-tolerance, since these long range planning phases make it possible to bypass obstacles (where no sensors are available) or faulty sensors (that have been disabled due to power failure or other natural events).
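The response function in the abstract above can be tried out numerically. The following is a minimal sketch only: the function name `response_probability`, the argument name `theta` for the particle's threshold, and the default steepness `n = 2` are illustrative assumptions, not part of FPSP as published.

```python
def response_probability(s: float, theta: float, n: float = 2.0) -> float:
    """Probability s^n / (s^n + theta^n) that a particle performs the task.

    s     : stimulus intensity of the sensing task (non-negative)
    theta : the particle's individual response threshold (non-negative)
    n     : steepness of the threshold (the abstract assumes n > 1)
    """
    if s < 0 or theta < 0:
        raise ValueError("s and theta must be non-negative")
    if s == 0 and theta == 0:
        return 0.0  # assumed convention for the undefined 0/0 case
    sn = s ** n
    return sn / (sn + theta ** n)


if __name__ == "__main__":
    # A lower threshold or a stronger stimulus both raise the probability.
    for theta in (0.5, 1.0, 2.0):
        print(theta, [round(response_probability(s, theta), 3) for s in (0.5, 1.0, 2.0)])
```

When the stimulus equals the threshold the probability is exactly 1/2, and a larger `n` makes the transition around that point sharper.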
[RACTI-RU1-2007-88] Kirousis, Lefteris and Stratiotis, Thodoris, An Energy-Fair Probabilistic Distributed Communication Protocol on Sensor Networks, 2007.
Abstract: Wireless Sensor Networks are complex systems consisting of a number of relatively simple autonomous sensing devices spread over a geographical area. The peculiarity of these devices lies in the constraints they face in relation to their energy reserves and their computational, storage and communication capabilities. The purpose of these sensors is to measure certain environmental conditions and to detect critical events in relation to these measurements. Those events then have to be reported to a specific central station, namely the “sink”. This data propagation generally has the form of a hop-by-hop transmission. In this framework we work on distributed data propagation protocols that take into account the energy reserves of the sensors. In particular, following the work of Chatzigiannakis et al. on the Probabilistic Forwarding Protocol (PFR), we present the distributed probabilistic protocol EFPFR, which favors transmissions from the less depleted sensors in addition to favoring transmissions close to the “optimal line”. This protocol is simple and relies only on local information for propagation decisions. Its main goal is to limit the total amount of energy dissipated per event and therefore to extend the network’s operation duration.
[RACTI-RU1-2014-6] Michail, Othon, Chatzigiannakis, Ioannis and Spirakis, Paul, Causality, Influence, and Computation in Possibly Disconnected Synchronous Dynamic Networks, in: Journal of Parallel and Distributed Computing (JPDC), volume 74, number 1, pages 2016-2026, 2014. [DOI]
Abstract: In this work, we study the propagation of influence and computation in dynamic distributed computing systems that are possibly disconnected at every instant. We focus on a synchronous message-passing communication model with broadcast and bidirectional links. Our network dynamicity assumption is a worst-case dynamicity controlled by an adversary scheduler, which has received much attention recently. We replace the usual (in worst-case dynamic networks) assumption that the network is connected at every instant by minimal temporal connectivity conditions. Our conditions only require that another causal influence occurs within every time window of some given length. Based on this basic idea, we define several novel metrics for capturing the speed of information spreading in a dynamic network. We present several results that correlate these metrics. Moreover, we investigate termination criteria in networks in which an upper bound on any of these metrics is known. We exploit our termination criteria to provide efficient (and optimal in some cases) protocols that solve the fundamental counting and all-to-all token dissemination (or gossip) problems.
[RACTI-RU1-2015-5] Michail, Othon, Chatzigiannakis, Ioannis and Spirakis, Paul, Computing in Dynamic Networks, in: Computational Network Theory: Theoretical Foundations and Applications, First Edition, Wiley-VCH Verlag GmbH & Co. KGaA, 2015.
Abstract: In this chapter, our focus is on computational network analysis from a theoretical point of view. In particular, we study the propagation of influence and computation in dynamic distributed computing systems. We focus on a synchronous message passing communication model with bidirectional links. Our network dynamicity assumption is a worst-case dynamicity controlled by an adversary scheduler, which has received much attention recently. We first study the fundamental naming and counting problems (and some variations) in networks that are anonymous, unknown, and possibly dynamic. Network dynamicity is modeled here by the 1-interval connectivity model, in which communication is synchronous and a (worst-case) adversary chooses the edges of every round subject to the condition that each instance is connected. We then replace this quite strong assumption by minimal temporal connectivity conditions. These conditions only require that another causal influence occurs within every time-window of some given length. Based on this basic idea we define several novel metrics for capturing the speed of information spreading in a dynamic network. We present several results that correlate these metrics. Moreover, we investigate termination criteria in networks in which an upper bound on any of these metrics is known. We exploit these termination criteria to provide efficient (and optimal in some cases) protocols that solve the fundamental counting and all-to-all token dissemination (or gossip) problems. Finally, we propose another model of worst-case temporal connectivity, called local communication windows, that assumes a fixed underlying communication network and restricts the adversary to allow communication between local neighborhoods in every time-window of some fixed length. We prove some basic properties and provide a protocol for counting in this model.
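To make the 1-interval connectivity model mentioned in the abstract above concrete, here is a small simulation sketch of all-to-all token dissemination by plain flooding. It is not one of the chapter's protocols, and several things are assumptions for illustration: the adversary is replaced by a random spanning tree per round, nodes carry unique ids that serve as tokens (the chapter also treats anonymous networks), messages are unbounded, and the simulator stops using global knowledge that real nodes would not have. All function names are hypothetical.

```python
import random


def random_connected_edges(n, rng):
    """Edges of a random spanning tree on nodes 0..n-1, so the round's graph is connected."""
    order = list(range(n))
    rng.shuffle(order)
    # Each node after the first attaches to a random node placed earlier in the order.
    return [(order[i], order[rng.randrange(i)]) for i in range(1, n)]


def flood_all_tokens(n, seed=0):
    """Run synchronous flooding until every node knows every token; return the round count."""
    rng = random.Random(seed)
    known = [{v} for v in range(n)]  # node v starts with its own token
    rounds = 0
    while any(len(k) < n for k in known):
        edges = random_connected_edges(n, rng)  # stand-in for the adversary's choice
        snapshot = [set(k) for k in known]      # messages carry start-of-round knowledge
        for u, v in edges:
            known[u] |= snapshot[v]
            known[v] |= snapshot[u]
        rounds += 1
    return rounds


if __name__ == "__main__":
    for n in (2, 8, 32):
        print(n, flood_all_tokens(n, seed=1))
```

Because every round's graph is connected, at least one new node learns each token per round while the token is not yet known everywhere, so flooding finishes within n - 1 rounds in this setting.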
[RACTI-RU1-2006-10] Ntarmos, Nikos, Triantafillou, Peter and Weikum, Gerhard, Counting at large: Efficient cardinality estimation in Internet-scale data networks, in: 22nd International Conference on Data Engineering (ICDE 2006), 2006.
Abstract: Counting in general, and estimating the cardinality of (multi-) sets in particular, is highly desirable for a large variety of applications, representing a foundational block for the efficient deployment and access of emerging internet-scale information systems. Examples of such applications range from optimizing query access plans in internet-scale databases, to evaluating the significance (rank/score) of various data items in information retrieval applications. The key constraints that any acceptable solution must satisfy are: (i) efficiency: the number of nodes that need be contacted for counting purposes must be small in order to enjoy small latency and bandwidth requirements; (ii) scalability, seemingly contradicting the efficiency goal: arbitrarily large numbers of nodes may need to add elements to a (multi-) set, which dictates the need for a highly distributed solution, avoiding server-based scalability, bottleneck, and availability problems; (iii) access and storage load balancing: counting and related overhead chores should be distributed fairly to the nodes of the network; (iv) accuracy: tunable, robust (in the presence of dynamics and failures) and highly accurate cardinality estimation; (v) simplicity and ease of integration: special, solution-specific indexing structures should be avoided. In this paper, first we contribute a highly-distributed, scalable, efficient, and accurate (multi-) set cardinality estimator. Subsequently, we show how to use our solution to build and maintain histograms, which have been a basic building block for query optimization for centralized databases, facilitating their porting into the realm of internet-scale data networks.
[RACTI-RU1-2009-87] Ntarmos, Nikos, Triantafillou, Peter and Weikum, Gerhard, Distributed Hash Sketches: Scalable, Efficient, and Accurate Cardinality Estimation for Distributed Multisets, in: ACM Transactions on Computer Systems, ACM TOCS, 2009.
Abstract: Counting items in a distributed system, and estimating the cardinality of multisets in particular, is important for a large variety of applications and a fundamental building block for emerging Internet-scale information systems. Examples of such applications range from optimizing query access plans in peer-to-peer data sharing, to computing the significance (rank/score) of data items in distributed information retrieval. The general formal problem addressed in this article is computing the network-wide distinct number of items with some property (e.g., distinct files with file name containing “spiderman”) where each node in the network holds an arbitrary subset, possibly overlapping the subsets of other nodes. The key requirements that a viable approach must satisfy are: (1) scalability towards very large network size, (2) efficiency regarding messaging overhead, (3) load balance of storage and access, (4) accuracy of the cardinality estimation, and (5) simplicity and easy integration in applications. This article contributes the DHS (Distributed Hash Sketches) method for this problem setting: a distributed, scalable, efficient, and accurate multiset cardinality estimator. DHS is based on hash sketches for probabilistic counting, but distributes the bits of each counter across network nodes in a judicious manner based on principles of Distributed Hash Tables, paying careful attention to fast access and aggregation as well as update costs. The article discusses various design choices, exhibiting tunable trade-offs between estimation accuracy, hop-count efficiency, and load distribution fairness. We further contribute a full-fledged, publicly available, open-source implementation of all our methods, and a comprehensive experimental evaluation for various settings.
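The abstract above builds on hash sketches for probabilistic counting. As a rough, single-machine illustration of that underlying idea only (not of DHS itself, which spreads the sketch bits over a Distributed Hash Table), the sketch below implements a Flajolet-Martin style estimator with stochastic averaging. The class name `HashSketch`, the use of SHA-1 as the hash, and the default of 256 bitmaps are assumptions made for this example.

```python
import hashlib

PHI = 0.77351  # correction constant from Flajolet and Martin's analysis


def _lowest_zero_bit(x: int) -> int:
    """Index of the least significant 0 bit of x."""
    r = 0
    while x & 1:
        x >>= 1
        r += 1
    return r


class HashSketch:
    """Probabilistic distinct counter using m bitmaps (stochastic averaging)."""

    def __init__(self, m: int = 256):
        self.m = m
        self.bitmaps = [0] * m

    def add(self, item: str) -> None:
        h = int.from_bytes(hashlib.sha1(item.encode("utf-8")).digest(), "big")
        bucket, rest = h % self.m, h // self.m
        # Position of the lowest 1 bit; it equals k with probability about 2^-(k+1).
        rho = (rest & -rest).bit_length() - 1 if rest else 150
        self.bitmaps[bucket] |= 1 << rho

    def merge(self, other: "HashSketch") -> "HashSketch":
        """Union of two sketches: bitwise OR, so duplicates across nodes do not count twice."""
        if self.m != other.m:
            raise ValueError("sketches must use the same number of bitmaps")
        out = HashSketch(self.m)
        out.bitmaps = [a | b for a, b in zip(self.bitmaps, other.bitmaps)]
        return out

    def estimate(self) -> float:
        mean_r = sum(_lowest_zero_bit(b) for b in self.bitmaps) / self.m
        return self.m / PHI * 2 ** mean_r
```

For example, two nodes holding overlapping file lists can each fill a `HashSketch`, and `a.merge(b).estimate()` then approximates the number of distinct files across both nodes, since the merged bitmaps are identical to those of a sketch built over the union.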
[RACTI-RU1-2010-12] Dolev, Shlomi, Schiller, Elad Michael, Spirakis, Paul and Tsigas, Ph., Game Authority for Robust and Scalable Distributed Selfish Computer Systems, in: Theoretical Computer Science, pages 2459-2466, 2010.
Abstract: Distributed algorithm designers often assume that system processes execute the same predefined software. Alternatively, when they do not assume that, designers turn to non-cooperative games and seek an outcome that corresponds to a rough consensus when no coordination is allowed. We argue that both assumptions are inapplicable in many real distributed systems, e.g., the Internet, and propose designing self-stabilizing and Byzantine fault-tolerant distributed game authorities. Once established, the game authority can secure the execution of any complete information game. As a result, we reduce costs that are due to the processes' freedom of choice. Namely, we reduce the price of malice.
[RACTI-RU1-2006-15] Michel, Sebastian, Bender, Matthias, Triantafillou, Peter and Weikum, Gerhard, Global Document Frequency Estimation in Peer-to-Peer Web Search, in: 9th International Workshop on the Web and Databases (WebDB 2006), pages 62-67, 2006.
Abstract: Information retrieval (IR) in peer-to-peer (P2P) networks, where the corpus is spread across many loosely coupled peers, has recently gained importance. In contrast to IR systems on a centralized server or server farm, P2P IR faces the additional challenge of either being oblivious to global corpus statistics or having to compute the global measures from local statistics at the individual peers in an efficient, distributed manner. One specific measure of interest is the global document frequency for different terms, which would be very beneficial as term-specific weights in the scoring and ranking of merged search results that have been obtained from different peers. This paper presents an efficient solution for the problem of estimating global document frequencies in a large-scale P2P network with very high dynamics where peers can join and leave the network on short notice. In particular, the developed method takes into account the fact that the local document collections of autonomous peers may arbitrarily overlap, so that global counting needs to be duplicate-insensitive. The method is based on hash sketches as a technique for compact data synopses. Experimental studies demonstrate the estimator's accuracy, scalability, and ability to cope with high dynamics. Moreover, the benefit for ranking P2P search results is shown by experiments with real-world Web data and queries.
[RACTI-RU1-2006-17] Michel, Sebastian, Bender, Matthias, Triantafillou, Peter and Weikum, Gerhard, IQN Routing: Integrating Quality and Novelty in P2P Querying and Ranking, in: 10th International Conference on Extending Database Technology (EDBT 2006), pages 62-67, 2006.
Abstract: Information retrieval (IR) in peer-to-peer (P2P) networks, where the corpus is spread across many loosely coupled peers, has recently gained importance. In contrast to IR systems on a centralized server or server farm, P2P IR faces the additional challenge of either being oblivious to global corpus statistics or having to compute the global measures from local statistics at the individual peers in an efficient, distributed manner. One specific measure of interest is the global document frequency for different terms, which would be very beneficial as term-specific weights in the scoring and ranking of merged search results that have been obtained from different peers. This paper presents an efficient solution for the problem of estimating global document frequencies in a large-scale P2P network with very high dynamics where peers can join and leave the network on short notice. In particular, the developed method takes into account the fact that the local document collections of autonomous peers may arbitrarily overlap, so that global counting needs to be duplicate-insensitive. The method is based on hash sketches as a technique for compact data synopses. Experimental studies demonstrate the estimator's accuracy, scalability, and ability to cope with high dynamics. Moreover, the benefit for ranking P2P search results is shown by experiments with real-world Web data and queries.
[RACTI-RU1-2008-9] Kalles, D., Kaporis, Alexis and Spirakis, Paul, Myopic Distributed Protocols for Singleton and Independent-Resource Congestion Games, in: 7th International Workshop on Experimental Algorithms (WEA 2008), pages 181-193, Springer-Verlag Berlin Heidelberg, Massachusetts, USA, 2008.
Abstract: Let n atomic players be routing their unsplittable flow on m resources. When each player has the option to drop her current resource and select a better one, and this option is exercised sequentially and unilaterally, then a Nash Equilibrium (NE) will eventually be reached. Acting sequentially, however, is unrealistic in large systems. But allowing concurrency, with an arbitrary number of players updating their resources at each time point, leads to an oscillation away from NE, due to big groups of players moving simultaneously and due to nonsmooth resource cost functions. In this work, we experimentally validate simple concurrent protocols that are realistic, distributed and myopic, yet are scalable, require only information local to each resource and, still, are experimentally shown to quickly reach an NE for a range of arbitrary cost functions.
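The sequential dynamics that the abstract above starts from (one player at a time moves to a strictly cheaper resource until nobody can improve) can be sketched as follows. This illustrates only that sequential baseline, not the paper's concurrent protocols, whose details are not given here. The function names and the example cost functions are hypothetical.

```python
import random


def sequential_best_response(n_players, cost_fns, seed=0):
    """Players take turns moving to their cheapest resource until no one can improve.

    cost_fns[r](load) is the cost each player on resource r pays when `load`
    players use it. Returns (choice per player, load per resource, number of moves).
    """
    rng = random.Random(seed)
    m = len(cost_fns)
    choice = [rng.randrange(m) for _ in range(n_players)]  # arbitrary starting assignment
    load = [0] * m
    for r in choice:
        load[r] += 1
    moves = 0
    improved = True
    while improved:
        improved = False
        for p in range(n_players):
            cur = choice[p]
            best, best_cost = cur, cost_fns[cur](load[cur])
            for r in range(m):
                if r != cur:
                    c = cost_fns[r](load[r] + 1)  # cost after joining resource r
                    if c < best_cost:
                        best, best_cost = r, c
            if best != cur:  # strictly improving unilateral move
                load[cur] -= 1
                load[best] += 1
                choice[p] = best
                moves += 1
                improved = True
    return choice, load, moves


def is_nash(choice, load, cost_fns):
    """True if no player can strictly lower her cost by switching resource alone."""
    for cur in choice:
        here = cost_fns[cur](load[cur])
        if any(cost_fns[r](load[r] + 1) < here for r in range(len(cost_fns)) if r != cur):
            return False
    return True


if __name__ == "__main__":
    costs = [lambda x: x, lambda x: 2 * x, lambda x: x * x]
    choice, load, moves = sequential_best_response(20, costs, seed=3)
    print(load, moves, is_nash(choice, load, costs))
```

Each strictly improving move lowers Rosenthal's potential function, which is why this loop terminates in a Nash Equilibrium.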
[RACTI-RU1-2010-60] Kalochristianakis, M. and Varvarigos, Emmanouel, Open source integrated remote systems and network management with OpenRSM, in: 4th International DMTF Workshop on Systems and Virtualization Management, SVM 2010, 2010.
Abstract: Managing the corporate Information Technology (IT) environment becomes increasingly complex as server logic architecture becomes distributed and the number of manageable entities increases. At the same time, the open source community has not yet produced a reliable systems and network management solution, even though there are open source initiatives specializing in individual fields of remote management. This paper presents OpenRSM, an integrated remote management system created by integrating individual open source initiatives and augmenting them to support additional functionality so that a lightweight integrated systems and network management solution is produced.
[RACTI-RU1-2008-47] Karalis, Y, Kalochristianakis, M., Kokkinos, Panagiotis and Varvarigos, Emmanouel, OpenRSM: An Open Source Lightweight Integrated Remote Network and Systems Management Solution, in: International Journal of Network Management, 2008.
Abstract: The management of the corporate information technology (IT) environment is rapidly increasing in complexity as server logic architecture becomes more distributed and the number of entities deployed increases, forcing enterprises to resort to thick, complex and expensive high-end integrated systems and network management solutions. Investing in such systems can be inefficient for small and medium corporations, since the vast majority of management tasks performed are routine tasks, while personnel specialization requirements and costs are high. At the same time, the open source community has not yet produced a reliable and complete system and network management solution. Even though there are open source initiatives specializing in specific fields of remote management, such as network management, there has been no integrated open source solution yet. This paper introduces the Open Source Remote Systems Management (OpenRSM) platform. OpenRSM is an integrated remote management system created by integrating individual specialized open source management initiatives and significantly augmenting them to support additional functionality, so that a complete lightweight system and network management solution is produced. The system implemented facilitates daily management by providing an efficient, simple and adaptable environment for the majority of management operations. Copyright © 2008 John Wiley & Sons, Ltd.
[RACTI-RU1-2011-5] Dolev, Shlomi, Schiller, Elad Michael, Spirakis, Paul and Tsigas, Ph., Robust and scalable middleware for selfish-computer systems, in: Computer Science Review, volume 5, number 1, 2011. [DOI]
Abstract: Distributed algorithm designers often assume that system processes execute the same predefined software. Alternatively, when they do not assume that, designers turn to non-cooperative games and seek an outcome that corresponds to a rough consensus when no coordination is allowed. We argue that both assumptions are inapplicable in many real distributed systems, e.g., the Internet, and propose designing self-stabilizing and Byzantine fault-tolerant distributed game authorities. Once established, the game authority can secure the execution of any complete information game. As a result, we reduce costs that are due to the processes' freedom of choice. Namely, we reduce the price of malice.
[RACTI-RU1-2008-76] Liagkou, Vasiliki, Secure and Trust Cryptographic Communication Protocols, Department of Computer Engineering and Informatics, 2008.
Abstract: In this PhD thesis, we try to use formal logic and threshold phenomena that asymptotically emerge with certainty in order to build new trust models and to evaluate the existing ones. The departure point of our work is that dynamic, global computing systems are not amenable to a static viewpoint of the trust concept, no matter how this concept is formalized. We believe that trust should be a statistical, asymptotic concept to be studied in the limit as the system's components grow according to some growth rate. Thus, our main goal is to define trust as an emerging system property that "appears" or "disappears" when a set of properties hold, asymptotically with probability 0 or 1 correspondingly. Here we try to combine first and second order logic in order to analyze the trust measures of specific network models. Moreover, we can use formal logic in order to determine whether generic reliability trust models provide a method for deriving trust between peers/entities as the network's components grow. Our approach can be used in a wide range of applications, such as monitoring the behavior of peers, providing a measure of trust between them, and assessing the level of reliability of peers in a network. Wireless sensor networks are comprised of a vast number of ultra-small autonomous computing, communication and sensing devices, with restricted energy and computing capabilities, that co-operate to accomplish a large sensing task. Sensor networks can be very useful in practice. Such systems should at least guarantee the confidentiality and integrity of the information reported to the controlling authorities regarding the realization of environmental events. Therefore, key establishment is critical for the protection of wireless sensor networks and the prevention of adversaries from attacking the network. Finally, in this dissertation we also propose three distributed group key establishment protocols suitable for such energy constrained networks.
This dissertation is composed of two parts. Part I develops the theory of the first and second order logic of graphs - their definition, and the analysis of their properties that are expressible in the first order language of graphs. In part II we introduce some new distributed group key establishment protocols suitable for sensor networks. Several key establishment schemes are derived and their performance is demonstrated.
[RACTI-RU1-2009-88] Ntarmos, Nikos, Triantafillou, Peter and Weikum, Gerhard, Statistical Structures for Internet-Scale Data Management, in: Statistical Structures for Internet-Scale Data Management, 2009.
Abstract: Efficient query processing in traditional database management systems relies on statistics on base data. For centralized systems, there is a rich body of research results on such statistics, from simple aggregates to more elaborate synopses such as sketches and histograms. For Internet-scale distributed systems, on the other hand, statistics management still poses major challenges. With the work in this paper we aim to endow peer-to-peer data management over structured overlays with the power associated with such statistical information, with emphasis on meeting the scalability challenge. To this end, we first contribute efficient, accurate, and decentralized algorithms that can compute key aggregates such as Count, CountDistinct, Sum, and Average. We show how to construct several types of histograms, such as simple Equi-Width, Average Shifted Equi-Width, and Equi-Depth histograms. We present a full-fledged open-source implementation of these tools for distributed statistical synopses, and report on a comprehensive experimental performance evaluation, evaluating our contributions in terms of efficiency, accuracy, and scalability.
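For readers unfamiliar with the histogram types named in the abstract above, the following centralized sketch contrasts Equi-Width and Equi-Depth bucketing on a small list of values. It says nothing about how the paper builds these structures in a decentralized way. The function names and the `(low, high, count)` bucket representation are assumptions for illustration.

```python
def equi_width(values, k):
    """Split [min, max] into k buckets of equal width; return (low, high, count) per bucket."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0  # fall back to 1.0 when all values are equal
    counts = [0] * k
    for v in values:
        i = min(int((v - lo) / width), k - 1)  # the maximum value goes into the last bucket
        counts[i] += 1
    return [(lo + i * width, lo + (i + 1) * width, counts[i]) for i in range(k)]


def equi_depth(values, k):
    """Split the sorted values into k buckets holding (nearly) equal numbers of values."""
    s = sorted(values)
    n = len(s)
    buckets = []
    for i in range(k):
        chunk = s[i * n // k:(i + 1) * n // k]
        if chunk:
            buckets.append((chunk[0], chunk[-1], len(chunk)))
    return buckets


if __name__ == "__main__":
    data = [1, 2, 2, 3, 10, 11, 12, 50]
    print(equi_width(data, 2))
    print(equi_depth(data, 2))
```

On skewed data such as this, equal-width buckets end up with very uneven counts, while equal-depth buckets keep the counts balanced at the price of uneven value ranges.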