Abstract: The management of Grid resources requires scheduling of both computation and communication tasks at various levels. In this study, we consider the two constituent sub-problems of Grid scheduling, namely: (i) the scheduling of computation tasks to processing resources and (ii) the routing and scheduling of the data movement in a Grid network. Regarding computation tasks, we examine two typical online task scheduling algorithms that employ advance reservations and perform full network simulation experiments to measure their performance when implemented in a centralized or distributed manner. Similarly, for communication tasks, we compare two routing and data scheduling algorithms that are implemented in a centralized or a distributed manner. We examine the effect network propagation delay has on the performance of these algorithms. Our simulation results indicate that a distributed architecture with an exhaustive resource utilization update strategy yields better average end-to-end delay performance than a centralized architecture.
Abstract: Many of the network security protocols employed today utilize symmetric block ciphers (DES, AES, CAST, etc.). The majority of symmetric block ciphers implement the crucial substitution operation using lookup tables, called substitution boxes (SBoxes). These structures should be highly nonlinear and have bit dispersal, i.e. avalanche, properties in order to render the cipher resistant to cryptanalysis attempts, such as linear and differential cryptanalysis. Highly secure substitution boxes can be constructed using particular Boolean functions as components that have certain mathematical properties which enhance the robustness of the whole cryptoalgorithm. However, enforcing these properties on SBoxes is a highly computationally intensive task. In this paper, we present a distributed algorithm and its implementation on a computing cluster that accelerates the construction of secure substitution boxes with good security properties. It is fully parametric since it can employ any class of Boolean functions with algorithmically definable properties and can construct SBoxes of arbitrary sizes. We demonstrate the efficiency of the distributed algorithm implementation compared to its sequential counterpart in a number of experiments.
Abstract: Load balancing/sharing is a policy which exploits the communication facility between the servers of a distributed system, by exchanging status information and jobs between any two servers of the system, in order to improve the performance of the whole system. In this work, we propose a new adaptive distributed hierarchical scheme, the Virtual Tree Algorithm (VTA), which creates a virtual binary tree structure over the actual network topology. It uses the Difference-Initiated (DI) technique ([11, 1]) for load balancing/sharing, which needs remote information for the transfer policy, and no additional information for the location policy. We demonstrate here that the introduced virtual construction can keep the number of exchanged messages at a level that compares favourably with those of the previously known efficient algorithms. To show the above statement and evaluate the performance of our policy, we make use of both analytical and simulation results. By using the simulation model that we developed, we compared our results with one of the most representative and new adaptive, symmetrical, distributed, and efficient algorithms, the Variable Threshold (V THR) algorithm.
Abstract: As a result of recent significant technological advances, a new computing and communication environment, Mobile Ad Hoc Networks (MANET), is about to enter the mainstream. A multitude of critical aspects, including mobility, severe limitations and limited reliability, create a new set of crucial issues and trade-offs that must be carefully taken into account in the design of robust and efficient algorithms for these environments. The communication among mobile hosts is one among the many issues that need to be resolved efficiently before MANET becomes a commodity.
In this paper, we propose to discuss the communication problem in MANET as well as present some characteristic techniques for the design, the analysis and the performance evaluation of distributed communication protocols for mobile ad hoc networks. More specifically, we propose to review two different design techniques. While the first type of protocols tries to create and maintain routing paths among the hosts, the second set of protocols uses a randomly moving subset of the hosts that acts as an intermediate pool for receiving and delivering messages. We discuss the main design choices for each approach, along with performance analysis of selected protocols.
Abstract: We discuss some new algorithmic and complexity issues in
coalitional and dynamic/evolutionary games, related to the understanding
of modern selfish and complex networks.
In particular: (a) We examine the achievement of equilibria via natural
distributed and greedy approaches in networks. (b) We present a model
of a coalitional game in order to capture the anarchy cost and complexity
of constructing equilibria in such situations. (c) We propose a stochastic
approach to some kinds of local interactions in networks, that can be
viewed also as extensions of the classical evolutionary game theoretic
setting.
Abstract: We study the problem of energy-balanced data propagation in wireless sensor networks. The energy balance property is crucial for maximizing the time the network is functional, by avoiding early energy depletion of a large portion of sensors. We propose a distributed, adaptive data propagation algorithm that exploits limited, local network density information for achieving energy-balance while at the same time
minimizing energy dissipation.
We investigate both uniform and heterogeneous sensor placement distributions. By a detailed experimental evaluation and comparison with well-known energy-balanced protocols, we show that our density-based protocol improves energy efficiency significantly while also having better energy balance properties.
Furthermore, we compare the performance of our protocol with a centralized, off-line optimum solution derived by a linear program which maximizes the network lifetime and show that it achieves near-optimal performance for uniform sensor deployments.
Abstract: We investigate random intersection graphs, a combinatorial model that quite accurately abstracts distributed networks with local interactions between nodes blindly sharing critical resources from a limited globally available domain. We study important combinatorial properties (independence and hamiltonicity) of such graphs. These properties relate crucially to algorithmic design for important problems (like secure communication and frequency assignment) in distributed networks characterized by dense, local interactions and resource limitations, such as sensor networks. In particular, we prove that, interestingly, a small constant number of random, resource selections suffices to make the graph hamiltonian and we provide tight evaluations of the independence number of these graphs.
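The model described above can be illustrated with a minimal sketch. It assumes the common G(n, m, p) formulation, in which each of n vertices independently picks each of m labels (the shared "resources") with probability p, and two vertices are adjacent when they share at least one label. The function names and parameters are hypothetical and chosen only for illustration; the abstract itself does not fix this parameterization.

```python
import random
from itertools import combinations

def random_intersection_graph(n, m, p, seed=None):
    """Sample (labels, edges) from an assumed G(n, m, p) intersection model."""
    rng = random.Random(seed)
    # Each vertex independently keeps each of the m labels with probability p.
    labels = [{l for l in range(m) if rng.random() < p} for _ in range(n)]
    # Two vertices are adjacent exactly when their label sets intersect.
    edges = {(u, v) for u, v in combinations(range(n), 2)
             if labels[u] & labels[v]}
    return labels, edges

def is_independent(vertices, edges):
    """True if no two of the given vertices are adjacent."""
    return all((u, v) not in edges and (v, u) not in edges
               for u, v in combinations(vertices, 2))

if __name__ == "__main__":
    labels, edges = random_intersection_graph(n=8, m=5, p=0.3, seed=1)
    print(len(edges), "edges among 8 vertices")
```

Vertices that selected no label are isolated, so any set of such vertices is trivially independent.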
Abstract: This chapter aims at presenting certain important aspects of the design of lightweight, event-driven algorithmic solutions for data dissemination in wireless sensor networks that provide support for reliable, efficient and concurrency-intensive operation. We wish to emphasize that efficient solutions at several levels are needed, e.g., higher level energy efficient routing protocols and lower level power management schemes. Furthermore, it is important to combine such different level methods into integrated protocols and approaches. Such solutions must be simple, distributed and local. Two useful algorithmic design principles are randomization (to trade-off efficiency and fault-tolerance) and adaptation (to adjust to high network dynamics towards improved operation). In particular, we provide a) a brief description of the technical specifications of state-of-the-art sensor devices, b) a discussion of possible models used to abstract such networks, emphasizing heterogeneity, c) some representative power management schemes, and d) a presentation of some characteristic protocols for data propagation. Crucial efficiency properties of these schemes and protocols (and their combinations, in some cases) are investigated by both rigorous analysis and performance evaluations through large scale simulations.
Abstract: The problem of communication among mobile nodes is one of the most fundamental problems in ad hoc mobile networks and is at the core of many algorithms, such as for counting the number of nodes, electing a leader, data processing etc. For an exposition of several important problems in ad hoc mobile networks. The work of Chatzigiannakis, Nikoletseas and Spirakis focuses on wireless mobile networks that are subject to highly dynamic structural changes created by mobility, channel fluctuations and device failures. These changes affect topological connectivity, occur with high frequency and may not be predictable in advance. Therefore, the environment where the nodes move (in three-dimensional space with possible obstacles) as well as the motion that the nodes perform are input to any distributed algorithm.
Abstract: In this work, we overview some results concerning communication combinatorial properties in random intersection graphs and uniform random intersection graphs. These properties relate crucially to algorithmic design for important problems (like secure communication and frequency assignment) in distributed networks characterized by dense, local interactions and resource limitations, such as sensor networks. In particular, we present and discuss results concerning the existence of large independent sets of vertices whp in random instances of each of these models. As the main contribution of our paper, we introduce a new, general model, which we denote G(V, χ, f). In this model, V is a set of vertices and χ is a set of m vectors in ℝ^m. Furthermore, f is a probability distribution over the powerset 2^χ of subsets of χ. Every vertex selects a random subset of vectors according to the probability f and two vertices are connected according to a general intersection rule depending on their assigned set of vectors. Apparently, this new general model seems to be able to simulate other known random graph models, by carefully describing its intersection rule.
Abstract: We study the partially eponymous model of distributed computation, which simultaneously
generalizes the anonymous and the eponymous models. In this model, processors have
identities, which are neither necessarily all identical (as in the anonymous model) nor
necessarily unique (as in the eponymous model). In a decision problem formalized as a
relation, processors receive inputs and seek to reach outputs respecting the relation. We
focus on the partially eponymous ring, and we shall consider the computation of circularly
symmetric relations on it. We consider sets of rings where all rings in the set have the same
multiset of identity multiplicities.
We distinguish between solvability and computability: in solvability, processors are
required to always reach outputs respecting the relation; in computability, they must
do so whenever this is possible, and must otherwise report impossibility.
We present a topological characterization of solvability for a relation on a set of rings,
which can be expressed as an efficiently checkable, number-theoretic predicate.
We present a universal distributed algorithm for computing a relation on a set of
rings; it runs any distributed algorithm for constructing views, followed by local steps.
We derive, as our main result, a universal upper bound on the message complexity to
compute a relation on a set of rings; this bound demonstrates a graceful degradation
with the Least Minimum Base, a parameter indicating the degree of least possible
eponymity for a set of rings. Thereafter, we identify two cases where a relation can be
computed on a set of rings, with rings of size n, with an efficient number of O(n lg n)
messages.
Abstract: We consider a synchronous distributed system with n processes that communicate through a dynamic network guaranteeing 1-interval connectivity, i.e., the network topology graph might change at each interval while remaining connected at all times. The processes belonging to the distributed system are identified through a set of labels L = {l1, l2, ..., lk} (with 1 ≤ k < n). In this challenging system model, the paper addresses the following problem: "counting the number of processes with the same label". We provide a distributed algorithm that is able to solve the problem based on the notion of energy transfer. Each process owns a fixed energy charge, and tries to discharge itself exchanging, at each round, at most half of its charge with neighbors. The paper also discusses when such counting is possible in the presence of failures. Counting processes with the same label in dynamic networks with homonyms is of great importance because it is as difficult as computing generic aggregating functions.
Abstract: DAP (Distributed Algorithms Platform) is a generic and homogeneous simulation environment aiming at the implementation, simulation, and testing of distributed algorithms for wired and wireless networks. In this work, we present its architecture, the most important design decisions, and discuss its distinct features and functionalities. DAP allows the algorithm designer to implement a distributed protocol by creating his own customized environment, and programming in a standard programming language in a style very similar to that of a real-world application. DAP provides a graphical user interface that allows the designer to monitor and control the execution of simulations, visualize algorithms, as well as gather statistics and other information for their experimental analysis and testing.
Abstract: Wireless sensor networks are comprised of a vast number of devices, situated in an area of interest that self organize in a structureless network, in order to monitor/record/measure an environmental variable or phenomenon and subsequently to disseminate the data to the control center.
Here we present research focused on the development, simulation and evaluation of energy efficient algorithms; our basic goal is to minimize energy consumption. Despite technology advances, the problem of energy use optimization remains valid since current and emerging hardware solutions fail to solve it.
We aim to reduce communication cost, by introducing novel techniques that facilitate the development of new algorithms. We investigated techniques of distributed adaptation of the operations of a protocol by using information available locally on every node, thus through local choices we improve overall performance. We propose techniques for collecting and exploiting limited local knowledge of the network conditions. In an energy efficient manner, we collect additional information which is used to achieve improvements such as forming energy efficient, low latency and fault tolerant paths to route data. We investigate techniques for managing mobility in networks where movement is a characteristic of the control center as well as the sensors. We examine methods for traversing and covering the network field based on probabilistic movement that uses local criteria to favor certain areas.
The algorithms we develop based on these techniques operate a) at low level managing devices, b) on the routing layer and c) network wide, achieving macroscopic behavior through local interactions. The algorithms are applied in network cases that differ in density, node distribution, available energy and also in fundamentally different models, such as under faults, with incremental node deployment and mobile nodes. In all these settings our techniques achieve significant gains, thus distinguishing their value as tools of algorithmic design.
Abstract: When one engineers distributed algorithms, some special characteristics
arise that are different from conventional (sequential or parallel)
computing paradigms. These characteristics include: the need for either a
scalable real network environment or a platform supporting a simulated
distributed environment; the need to incorporate asynchrony, where arbitrary
asynchrony is hard, if not impossible, to implement; and the generation
of “difficult” input instances which is a particular challenge. In this
work, we identify some of the methodological issues required to address
the above characteristics in distributed algorithm engineering and illustrate
certain approaches to tackle them via case studies. Our discussion
begins by addressing the need of a simulation environment and how asynchrony is
incorporated when experimenting with distributed algorithms.
We then proceed by suggesting two methods for generating difficult input
instances for distributed experiments, namely a game-theoretic one and another
based on simulations of adversarial arguments or lower bound proofs.
We give examples of the experimental analysis of a pursuit-evasion protocol
and of a shared memory problem in order to demonstrate these ideas.
We then address a particularly interesting case of conducting experiments
with algorithms for mobile computing and tackle the important issue of
motion of processes in this context. We discuss the two-tier principle as
well as a concurrent random walks approach on an explicit representation
of motions in ad-hoc mobile networks, which allow at least for average-case
analysis and measurements and may give worst-case inputs in some
cases. Finally, we discuss a useful interplay between theory and practice
that arises in modeling attack propagation in networks.
Abstract: Wireless sensor networks are comprised of a vast number of ultra-small fully autonomous computing, communication and sensing devices, with very restricted energy and computing capabilities, which co-operate to accomplish a large sensing task. Such networks can be very useful in practice in applications that require fine-grain monitoring of physical environment subjected to critical conditions (such as inaccessible terrains or disaster places). Very large numbers of sensor devices can be deployed in areas of interest and use self-organization and collaborative methods to form deeply networked environments. Features including the huge number of sensor devices involved, the severe power, computational and memory limitations, their dense deployment and frequent failures, pose new design and implementation aspects. The efficient and robust realization of such large, highly-dynamic, complex, non-conventional environments is a challenging algorithmic and technological task. In this work we consider certain important aspects of the design, deployment and operation of distributed algorithms for data propagation in wireless sensor networks and discuss some characteristic protocols, along with an evaluation of their performance.
Abstract: This paper deals with systems of multiple mobile robots each of which observes the positions of the other robots and moves to a new position so that eventually the robots form a circle. In the model we study, the robots are anonymous and oblivious, in the sense that they cannot be distinguished by their appearance and do not have a common x-y coordinate system, while they are unable to remember past actions.
We propose a new distributed algorithm for circle formation on the plane. We prove that our algorithm is correct and provide an upper bound for its performance. In addition, we conduct an extensive and detailed comparative simulation experimental study with the DK algorithm described in [7]. The results show that our algorithm is very simple and takes considerably less time to execute than algorithm DK.
Abstract: An ad hoc mobile network is a collection of mobile hosts, with wireless communication capabilities, forming a temporary network without the aid of any established fixed infrastructure. In such networks, topological connectivity is subject to frequent, unpredictable change. Our work focuses on networks with high rate of such changes to connectivity. For such dynamically changing networks we propose protocols which exploit the co-ordinated (by the protocol) motion of a small part of the network. We show that such protocols can be designed to work correctly and efficiently even in the case of arbitrary (but not malicious) movements of the hosts not affected by the protocol. We also propose a methodology for the analysis of the expected behavior of protocols for such networks, based on the assumption that mobile hosts (those whose motion is not guided by the protocol) conduct concurrent random walks in their motion space. In particular, our work examines the fundamental problem of communication and proposes distributed algorithms for it. We provide rigorous proofs of their correctness, and also give performance analyses by combinatorial tools. Finally, we have evaluated these protocols by experimental means.
Abstract: We exploit the game-theoretic ideas presented in [12] to
study the vertex coloring problem in a distributed setting. The vertices
of the graph are seen as players in a suitably defined strategic game,
where each player has to choose some color, and the payoff of a vertex is
the total number of players that have chosen the same color as its own.
We extend here the results of [12] by showing that, if any subset of nonneighboring
vertices perform a selfish step (i.e., change their colors in order
to increase their payoffs) in parallel, then a (Nash equilibrium) proper
coloring, using a number of colors within several known upper bounds
on the chromatic number, can still be reached in polynomial time. We
also present an implementation of the distributed algorithm in wireless
networks of tiny devices and evaluate the performance in simulated and
experimental environments. The performance analysis indicates that it
is the first practically implementable distributed algorithm.
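The selfish-step dynamics described above can be sketched sequentially as follows. This is an illustrative sketch, not the authors' implementation. It assumes that a vertex sharing a color with a neighbor receives payoff 0 (the abstract states only the payoff for the same-color count), that every vertex starts with its own distinct color, and that steps are taken one vertex at a time rather than in parallel. The names `payoff` and `selfish_coloring` are hypothetical.

```python
def payoff(v, color, coloring, adj):
    """Payoff of v if it plays `color`: the resulting class size, or 0 on a conflict."""
    if any(coloring[u] == color for u in adj[v]):
        return 0  # assumed penalty for sharing a color with a neighbor
    return 1 + sum(1 for u in coloring if u != v and coloring[u] == color)

def selfish_coloring(adj):
    """Run sequential selfish steps until no vertex can improve its payoff."""
    # Assumption: start from the trivial proper coloring with distinct colors.
    coloring = {v: i for i, v in enumerate(adj)}
    improved = True
    while improved:
        improved = False
        for v in adj:
            best_color = coloring[v]
            best = payoff(v, best_color, coloring, adj)
            for c in set(coloring.values()):
                p = payoff(v, c, coloring, adj)
                if p > best:  # strict improvement only
                    best_color, best = c, p
            if best_color != coloring[v]:
                coloring[v] = best_color
                improved = True
    return coloring

if __name__ == "__main__":
    cycle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
    print(selfish_coloring(cycle4))
```

Each move strictly increases the sum of squared color-class sizes, so the loop terminates, and a vertex only ever moves to a color none of its neighbors uses, so the coloring stays proper throughout.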
Abstract: Top-k query processing is a fundamental building block for efficient ranking in a large number of applications. Efficiency is a central issue, especially for distributed settings, when the data is spread across different nodes in a network. This paper introduces novel optimization methods for top-k aggregation queries in such distributed environments. The optimizations can be applied to all algorithms that fall into the frameworks of the prior TPUT and KLEE methods. The optimizations address three degrees of freedom: 1) hierarchically grouping input lists into top-k operator trees and optimizing the tree structure, 2) computing data-adaptive scan depths for different input sources, and 3) data-adaptive sampling of a small subset of input sources in scenarios with hundreds or thousands of query-relevant network nodes. All optimizations are based on a statistical cost model that utilizes local synopses, e.g., in the form of histograms, efficiently computed convolutions, and estimators based on order statistics. The paper presents comprehensive experiments, with three different real-life datasets and using the ns-2 network simulator for a packet-level simulation of a large Internet-style network.
Abstract: We address the issue of measuring distribution fairness in Internet-scale networks. This problem has several interesting instances encountered in different applications, ranging from assessing the distribution of load between network nodes for load balancing purposes, to measuring node utilization for optimal resource exploitation, and to guiding autonomous decisions of nodes in networks built with market-based economic principles. Although some metrics have been proposed, particularly for assessing load balancing algorithms, they fall short. We first study the appropriateness of various known and previously proposed statistical metrics for measuring distribution fairness. We put forward a number of required characteristics for appropriate metrics. We propose and comparatively study the appropriateness of the Gini coefficient (G) for this task. Our study reveals as most appropriate the metrics of G, the fairness index (FI), and the coefficient of variation (CV) in this order. Second, we develop six distributed sampling algorithms to estimate metrics online efficiently, accurately, and scalably. One of these algorithms (2-PRWS) is based on two effective optimizations of a basic algorithm, and the other two (the sequential sampling algorithm, LBS-HL, and the clustered sampling one, EBSS) are novel, developed especially to estimate G. Third, we show how these metrics, and especially G, can be readily utilized online by higher-level algorithms, which can now know when to best intervene to correct unfair distributions (in particular, load imbalances). We conclude with a comprehensive experimentation which comparatively evaluates both the various proposed estimation algorithms and the three most appropriate metrics (G, CV, and FI). Specifically, the evaluation quantifies the efficiency (in terms of number of messages and a latency indicator), precision, and accuracy achieved by the proposed algorithms when estimating the competing fairness metrics.
The central conclusion is that the proposed metric, G, can be estimated with a small number of messages and latency, regardless of the skew of the underlying distribution.
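For readers unfamiliar with the three metrics, the following sketch computes them exactly on a small, fully known load vector. It does not reproduce the paper's distributed sampling estimators. It assumes the standard mean-absolute-difference form of the Gini coefficient, Jain's definition of the fairness index, and the population standard deviation for CV; the abstract does not say which variants the authors use.

```python
from statistics import mean, pstdev

def gini(xs):
    """Gini coefficient: 0 for a perfectly even distribution."""
    n, mu = len(xs), mean(xs)
    return sum(abs(a - b) for a in xs for b in xs) / (2 * n * n * mu)

def fairness_index(xs):
    """Jain's fairness index (assumed definition): 1 when all values are equal."""
    return sum(xs) ** 2 / (len(xs) * sum(x * x for x in xs))

def coefficient_of_variation(xs):
    """Population standard deviation divided by the mean."""
    return pstdev(xs) / mean(xs)

if __name__ == "__main__":
    for loads in ([5, 5, 5, 5], [0, 0, 0, 10]):
        print(loads, gini(loads), fairness_index(loads),
              coefficient_of_variation(loads))
```

On the even vector the three values are 0, 1 and 0; on the skewed vector, where one node carries all the load, they are 0.75, 0.25 and roughly 1.73.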
Abstract: Two important performance parameters of distributed, rate-based flow control algorithms are their locality and convergence complexity. The former is characterized by the amount of global knowledge that is available to their scheduling mechanisms, while the latter is defined as the number of update operations performed on rates of individual sessions until max-min fairness is reached. Optimistic algorithms allow any session to intermediately receive a rate larger than its max-min fair rate; bottleneck algorithms finalize the rate of a session only if it is restricted by a certain, highly congested link of the network. In this work, we present a comprehensive collection of lower and upper bounds on convergence complexity, under varying degrees of locality, for optimistic, bottleneck, rate-based flow control algorithms. Say that an algorithm is oblivious if its scheduling mechanism uses no information of either the session rates or the network topology. We present a novel, combinatorial construction of a capacitated network, which we use to establish a fundamental lower bound of dn/4 + n/2 on the convergence complexity of any oblivious algorithm, where n is the number of sessions laid out on a network, and d, the session dependency, is a measure of topological dependencies among sessions. Moreover, we devise a novel simulation proof to establish that, perhaps surprisingly, the lower bound of dn/4 + n/2 on convergence complexity still holds for any partially oblivious algorithm, in which the scheduling mechanism is allowed to use information about session rates, but is otherwise unaware of network topology. On the positive side, we prove that the lower bounds for oblivious and partially oblivious algorithms are both tight. We do so by presenting optimal oblivious algorithms, which converge after dn/2 + n/2 update operations are performed in the worst case.
To complete the picture, we show that linear convergence complexity can indeed be achieved if information about both session rates and network topology is available to schedulers. We present a counterexample, nonoblivious algorithm, which converges within an optimal number of n update operations. Our results imply a surprising convergence complexity collapse of oblivious and partially oblivious algorithms, and a convergence complexity separation between (partially) oblivious and nonoblivious algorithms for optimistic, bottleneck rate-based flow control.
Abstract: Andrews et al. [Automatic method for hiding latency in high bandwidth networks, in: Proceedings of the ACM Symposium on Theory of Computing, 1996, pp. 257–265; Improved methods for hiding latency in high bandwidth networks, in: Proceedings of the Eighth Annual ACM Symposium on Parallel Algorithms and Architectures, 1996, pp. 52–61] introduced a number of techniques for automatically hiding latency when performing simulations of networks with unit delay links on networks with arbitrary unequal delay links. In their work, they assume that processors of the host network are identical in computational power to those of the guest network being simulated. They further assume that the links of the host are able to pipeline messages, i.e., they are able to deliver P packets in time O(P+d) where d is the delay on the link.
In this paper we examine the effect of eliminating one or both of these assumptions. In particular, we provide an efficient simulation of a linear array of homogeneous processors connected by unit-delay links on a linear array of heterogeneous processors connected by links with arbitrary delay. We show that the slowdown achieved by our simulation is optimal. We then consider the case of simulating cliques by cliques; i.e., a clique of heterogeneous processors with arbitrary delay links is used to simulate a clique of homogeneous processors with unit delay links. We reduce the slowdown from the obvious bound of the maximum delay link to the average of the link delays. In the case of the linear array we consider both links with and without pipelining. For the clique simulation the links are not assumed to support pipelining.
The main motivation of our results (as was the case with Andrews et al.) is to mitigate the degradation of performance when executing parallel programs designed for different architectures on a network of workstations (NOW). In such a setting it is unlikely that the links provided by the NOW will support pipelining and it is quite probable the processors will be heterogeneous. Combining our result on clique simulation with well-known techniques for simulating shared memory PRAMs on distributed memory machines provides an effective automatic compilation of a PRAM algorithm on a NOW.
Abstract: We call radiation at a point of a wireless network the total amount of electromagnetic quantity (energy or power density) the point is exposed to. The impact of radiation can be high and we believe it is worth studying and controlling; towards radiation-aware wireless networking we take (for the first time in the study of this aspect) a distributed computing, algorithmic approach. We exemplify this line of research by focusing on sensor networks, studying the minimum radiation path problem of finding the lowest radiation trajectory of a person moving from a source to a destination point in the network region. For this problem, we sketch the main ideas behind a linear program that can provide a tight approximation of the optimal solution, and then we discuss three heuristics that can lead to low radiation paths. We also plan to investigate the impact of diverse node mobility on the heuristics' performance.
Abstract: We present key aspects (hardware, software, topology, networking) of SenseWall, an experimental sensor network test-bed we have created for the implementation and engineering of distributed sensor network algorithms. We then describe how SenseWall has been in particular used to implement two recent state of the art algorithms for energy balanced sensor data propagation. We elaborate on the issues and challenges created by the restrictions and particularities of the experimental test-bed and how we dealt with them. We also carry out a detailed performance evaluation comparing the energy balance protocols to two baseline protocols that include only either single hop or direct data transmissions.
Abstract: We study the problem of fair resource allocation in a simple cooperative multi-agent setting where we have k agents and a set of n objects to be allocated to those agents. Each object is associated with a weight represented by a positive integer or real number. We would like to allocate all objects to the agents so that each object is allocated to only one agent and the weight is distributed fairly. We adopt the fairness index popularized by the networking community as our measure of fairness, and study centralized algorithms for fair resource allocation. Based on the relationship between our problem and number partitioning, we devise a greedy algorithm for fair resource allocation that runs in polynomial time but is not guaranteed to find the optimal solution, and a complete anytime algorithm that finds the optimal solution but runs in exponential time. Then we study the phase transition behavior of the complete algorithm. Finally, we demonstrate that the greedy algorithm actually performs very well and returns almost perfectly fair allocations.
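As a rough illustration of the setting in the preceding abstract, the sketch below assumes that the "fairness index popularized by the networking community" is Jain's index, and uses the standard heaviest-first greedy heuristic from number partitioning. Both are assumptions made for illustration: the abstract names neither, and the function names `jain_index` and `greedy_allocate` are hypothetical.

```python
import heapq


def jain_index(loads):
    """Jain's fairness index: (sum x)^2 / (k * sum x^2); 1.0 means a perfectly even split."""
    total = sum(loads)
    squares = sum(x * x for x in loads)
    if squares == 0:
        return 1.0  # nothing allocated, so trivially even
    return total * total / (len(loads) * squares)


def greedy_allocate(weights, k):
    """Give each object, heaviest first, to the agent with the smallest load so far.

    Illustrative heuristic only; not claimed to be the authors' algorithm.
    """
    heap = [(0, agent) for agent in range(k)]  # entries are (current load, agent id)
    heapq.heapify(heap)
    bundles = [[] for _ in range(k)]
    for w in sorted(weights, reverse=True):
        load, agent = heapq.heappop(heap)
        bundles[agent].append(w)
        heapq.heappush(heap, (load + w, agent))
    return bundles


if __name__ == "__main__":
    bundles = greedy_allocate([8, 7, 6, 5, 4], k=2)
    loads = [sum(b) for b in bundles]
    print(bundles, loads, round(jain_index(loads), 4))
```

On the weights 8, 7, 6, 5, 4 with two agents, this greedy rule produces loads of 17 and 13, while the split {8, 7} and {6, 5, 4} is perfectly even. This is a small instance of a polynomial-time heuristic coming close to, but missing, the optimum.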
Abstract: In this paper we present the design of a simulator platform called FUSE (Fast Universal Simulator Engine). The term Universal means that the Engine can be adapted easily to different domains and be used for varying simulation needs, although our main target is simulation of distributed algorithms in distributed computing environments. The Engine is Fast in the sense that the simulation overhead is minimal and very large systems can be simulated. We discuss the architecture and the design decisions that form the basis of these features. We also describe the functionality that is provided to its users (e.g., monitoring, statistics, etc.).
Abstract: Distributed algorithm designers often assume that system processes execute the same predefined software. Alternatively, when they do not assume that, designers turn to non-cooperative games and seek an outcome that corresponds to a rough consensus when no coordination is allowed. We argue that both assumptions are inapplicable in many real distributed systems, e.g., the Internet, and propose designing self-stabilizing and Byzantine fault-tolerant distributed game authorities. Once established, the game authority can secure the execution of any complete information game. As a result, we reduce costs that are due to the processes' freedom of choice. Namely, we reduce the price of malice.
Abstract: In this work we study the implementation of multicost routing in a distributed way in wireless mobile ad hoc networks. In contrast to traditional single-cost routing, where each path is characterized by a scalar, in multicost routing a vector of cost parameters is assigned to each network link, from which the cost vectors of candidate paths are calculated. These parameters are combined in various optimization functions, corresponding to different routing algorithms, for selecting the optimal path. Up until now the performance of multicost and multi-constrained routing in wireless ad hoc networks has been evaluated either at a theoretical level or by assuming that nodes are static and have full knowledge of the network topology and nodes' state. In the present paper we assess the performance of multicost routing based on energy-related parameters in mobile ad hoc networks by embedding its logic in the Dynamic Source Routing (DSR) algorithm, which is a well-known fully distributed routing algorithm. We use simulations to compare the performance of the multicost-DSR algorithm to that of the original DSR algorithm and examine their behavior under various node mobility scenarios. The results confirm that the multicost-DSR algorithm improves the performance of the network in comparison to the original DSR algorithm in terms of energy efficiency. The multicost-DSR algorithm enhances the performance of the network not only by reducing energy consumption overall in the network, but also by spreading energy consumption more uniformly across the network, prolonging the network lifetime and reducing the packet drop probability. Furthermore, the delay suffered by the packets reaching their destination for the case of the multicost-DSR algorithm is shown to be lower than in the case of the original DSR algorithm.
Abstract: With this work we aim to make a three-fold contribution. We first address the issue of efficiently supporting queries over string attributes involving prefix, suffix, containment, and equality operators in large-scale data networks. Our first design decision is to employ distributed hash tables (DHTs) for the data network's topology, harnessing their desirable properties. Our next design decision is to derive DHT-independent solutions, treating the DHT as a black box. Second, we exploit this infrastructure to develop efficient content-based publish/subscribe systems. The main contributions here are algorithms for the efficient processing of queries (subscriptions) and events (publications). Specifically, we show that our subscription processing algorithms require O(log N) messages for an N-node network, and our event processing algorithms require O(l · log N) messages (with l being the average string length). Third, we develop algorithms for optimizing the processing of multi-dimensional events, involving several string attributes. Further to our analysis, we provide simulation-based experiments showing promising performance results in terms of number of messages, required bandwidth, load balancing, and response times.
Abstract: In this book chapter we will consider key establishment protocols for wireless sensor networks.
Several protocols have been proposed in the literature for the establishment of a shared group key for wired networks.
The choice of a protocol depends on whether the key is established by one of the participants (and then transported to the other(s)) or agreed among the participants, and on the underlying cryptographic mechanisms (symmetric or asymmetric). Clearly, the design of key establishment protocols for sensor networks must deal with different problems and challenges that do not exist in wired networks. To name a few: wireless links are particularly vulnerable to eavesdropping; sensor devices can be captured (and the secrets they contain can be compromised); and in many upcoming wireless sensor networks, nodes cannot rely on the presence of an online trusted server (whereas most standardized authentication and key establishment protocols do rely on such a server).
In particular, we will consider five distributed group key establishment protocols. Each of these protocols applies a different algorithmic technique that makes it more suitable for (i) static sensor networks, (ii) sensor networks where nodes enter sleep mode (i.e. dynamic, with low rate of updates on the connectivity graph) and (iii) fully dynamic networks where nodes may even be mobile. On the other hand, the common factor for all five protocols is that they can be applied in dynamic groups (where members can be excluded or added) and provide forward and backward secrecy. All these protocols are based on the Diffie-Hellman key exchange algorithm and constitute natural extensions of it in the multiparty case.
Abstract: This paper addresses the efficient processing of
top-k queries in wide-area distributed data
repositories where the index lists for the attribute
values (or text terms) of a query are distributed
across a number of data peers and the
computational costs include network latency,
bandwidth consumption, and local peer work.
We present KLEE, a novel algorithmic
framework for distributed top-k queries,
designed for high performance and flexibility.
KLEE makes a strong case for approximate top-k
algorithms over widely distributed data sources.
It shows how great gains in efficiency can be
enjoyed at low result-quality penalties. Further,
KLEE affords the query-initiating peer the
flexibility to trade off result quality and expected
performance and to trade off the number of
communication phases engaged during query
execution versus network bandwidth
performance. We have implemented KLEE and
related algorithms and conducted a
comprehensive performance evaluation. Our
evaluation employed real-world and synthetic
large, web-data collections, and query
benchmarks. Our experimental results show that
KLEE can achieve major performance gains in
terms of network bandwidth, query response
times, and much lighter peer loads, all with small
errors in result precision and other result-quality
measures.
Abstract: We address the issue of measuring storage or query load distribution fairness in peer-to-peer data management systems. Existing metrics may look promising from the point of view of specific peers, while in reality being far from optimal from a global perspective. Thus, first we define the requirements and study the appropriateness of various statistical metrics for measuring load distribution fairness against these requirements. The metric proposed as most appropriate is the Gini coefficient (G). Second, we develop novel distributed sampling algorithms to compute G on-line, with high precision, efficiently, and scalably. Third, we show how G can readily be utilized on-line by higher-level algorithms which can now know when to best intervene to correct load imbalances. Our analysis and experiments attest to the efficiency and accuracy of these algorithms, permitting the online use of a rich and reliable metric, conveying a global perspective of the distribution.
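For readers unfamiliar with the metric named in the preceding abstract, here is a minimal, centralized sketch of the Gini coefficient over a list of per-peer loads. It uses the textbook sorted-values formula. The distributed sampling algorithms mentioned in the abstract are not reproduced here, and the function name `gini` is hypothetical.

```python
def gini(loads):
    """Gini coefficient of non-negative loads: 0.0 is perfectly even, values near 1 are concentrated."""
    xs = sorted(loads)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0  # no peers or no load: treat as perfectly even
    # Textbook closed form over values sorted ascending, with 1-based ranks i:
    #   G = 2 * sum_i(i * x_i) / (n * sum_i x_i) - (n + 1) / n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


if __name__ == "__main__":
    print(gini([5, 5, 5, 5]))  # evenly loaded peers
    print(gini([0, 0, 0, 20]))  # one peer carries everything
```

With n peers, the value ranges from 0 (all loads equal) up to (n - 1)/n (a single peer holds the entire load), which gives a single global number rather than a per-peer view.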
Abstract: In this paper we examine the problem of searching for some information item in the nodes of a fully
interconnected computer network, where each node contains information relevant to some topic
as well as links to other network nodes that also contain information, not necessarily related to
locally kept information. These links are used to facilitate the Internet users and mobile software
agents that try to locate specific pieces of information. However, the links do not necessarily point
to nodes containing information of interest to the user or relevant to the aims of the mobile agent.
Thus an element of uncertainty is introduced. For example, when an Internet user or some search
agent lands on a particular network node, they see a set of links that point to information that is,
supposedly, relevant to the current search. Therefore, we can assume that a link points to relevant
information with some unknown probability p that, in general, is related to the number of nodes
in the network (intuitively, as the network grows, this probability tends to zero since adding more
nodes to the network renders some extant links less accurate or obsolete). Consequently, since there
is uncertainty as to whether the links contained in a node's Web page are correct or not, a search
algorithm cannot rely on following the links systematically since it may end up spending too much
time visiting nodes that contain irrelevant information. In this work, we will describe and analyze
a search algorithm that is only allowed to transfer a fixed amount of memory along communication
links as it visits the network nodes. The algorithm is, however, allowed to use one bit of memory at
each node as an "already visited" flag. In this way the algorithm has its memory distributed to the
network nodes, avoiding overloading the network links as it moves from node to node searching for
the information. We work on fully interconnected networks for simplicity reasons and, moreover,
because according to some recent experimental evidence, such networks can be considered to be a
good approximation of the current structure of the World Wide Web.
Abstract: The promises inherent in users coming together to form data
sharing network communities, bring to the foreground new problems formulated
over such dynamic, ever growing, computing, storage, and networking
infrastructures. A key open challenge is to harness these highly
distributed resources toward the development of an ultra scalable, efficient
search engine. From a technical viewpoint, any acceptable solution
must fully exploit all available resources dictating the removal of any
centralized points of control, which can also readily lead to performance
bottlenecks and reliability/availability problems. Equally importantly,
however, a highly distributed solution can also facilitate pluralism in informing
users about internet content, which is crucial in order to preclude
the formation of information-resource monopolies and the biased visibility
of content from economically-powerful sources. To meet these challenges,
the work described here puts forward MINERVA∞, a novel search
engine architecture, designed for scalability and efficiency. MINERVA∞
encompasses a suite of novel algorithms, including algorithms for creating
data networks of interest, placing data on network nodes, load balancing,
top-k algorithms for retrieving data at query time, and replication algorithms
for expediting top-k query processing. We have implemented the
proposed architecture and we report on our extensive experiments with
real-world, web-crawled, and synthetic data and queries, showcasing the
scalability and efficiency traits of MINERVA∞.
Abstract: Recent rapid developments in micro-electro-mechanical systems
(MEMS), wireless communications and digital electronics have already
led to the development of tiny, low-power, low-cost sensor devices.
Such devices integrate sensing, limited data processing and restricted
communication capabilities.
Each sensor device individually might have small utility; however, the
effective distributed co-ordination of large numbers of such devices can
lead to the efficient accomplishment of large sensing tasks. Large numbers
of sensors can be deployed in areas of interest (such as inaccessible
terrains or disaster places) and use self-organization and collaborative
methods to form an ad-hoc network.
We note however that the efficient and robust realization of such large,
highly-dynamic, complex, non-conventional networking environments is
a challenging technological and algorithmic task, because of the unique
characteristics and severe limitations of these devices.
This talk will present and discuss several important aspects of the
design, deployment and operation of sensor networks. In particular, we
provide a brief description of the technical specifications of state-of-the-art
sensors, a discussion of possible models used to abstract such networks,
a discussion of some key algorithmic design techniques (like randomization,
adaptation and hybrid schemes), a presentation of representative
protocols for sensor networks, for important problems including data
propagation, collision avoidance and energy balance and an evaluation
of crucial performance properties (correctness, efficiency, fault-tolerance)
of these protocols, both with analytic and simulation means.
Abstract: In this paper we propose an energy-aware broadcast algorithm for wireless networks. Our algorithm is based on the multicost approach and selects the set of nodes that by transmitting implement broadcasting in an optimally energy-efficient way. The energy-related parameters taken into account are the node transmission power and the node residual energy. The algorithm's complexity however is non-polynomial, and therefore, we propose a relaxation producing a near-optimal solution in polynomial time. We also consider a distributed information exchange scheme that can be coupled with the proposed algorithms and examine the overhead introduced by this integration. Using simulations we show that the proposed algorithms outperform other solutions in the literature in terms of energy efficiency. Moreover, it is shown that the near-optimal algorithm obtains most of the performance benefits of the optimal algorithm at a smaller computational overhead.
Abstract: An ad-hoc mobile network is a collection of mobile hosts, with
wireless communication capabilities, forming a temporary network
without the aid of any established fixed infrastructure.
In such networks, topological connectivity is subject to frequent,
unpredictable change. Our work focuses on networks with high
rate of such changes to connectivity. For such dynamically changing
networks we propose protocols which exploit the co-ordinated
(by the protocol) motion of a small part of the network.
We show that such protocols can be designed to work
correctly and efficiently even in the case of arbitrary (but not
malicious) movements of the hosts not affected by the protocol.
We also propose a methodology for the analysis of the expected
behaviour of protocols for such networks, based on the assumption that mobile hosts (whose motion is not guided by
the protocol) conduct concurrent random walks in their
motion space.
Our work examines some fundamental problems such as pair-wise
communication, election of a leader and counting, and proposes
distributed algorithms for each of them. We provide their
proofs of correctness, and also give a rigorous analysis using
combinatorial tools as well as experiments.
Abstract: We investigate the problem of how to achieve energy balanced data propagation in distributed wireless sensor networks. The energy balance property guarantees that the average per sensor energy dissipation is the same for all sensors in the network, throughout the execution of the data propagation protocol. This property is crucial for prolonging the network lifetime, by avoiding early energy depletion of sensors.
We survey representative solutions from the state of the art. We first present a basic algorithm that in each step probabilistically decides whether to propagate data one hop towards the final destination (the sink), or to send it directly to the sink. This randomized choice trades off the (cheap, but slow) one-hop transmissions against the direct transmissions to the sink, which are more expensive but bypass the bottleneck region around the sink and propagate data fast. By a detailed analysis using properties of stochastic processes and recurrence relations we precisely estimate (even in closed form) the probability for each propagation option necessary for energy balance.
The fact (shown by our analysis) that direct (expensive) transmissions to the sink are needed only rarely shows that our protocol, besides being energy balanced, is also energy efficient. We then enhance this basic result by surveying some recent findings, including a generalized algorithm, a demonstration of the optimality of this two-way probabilistic data propagation, and formal proofs of the energy optimality of the energy balance property.
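The trade-off described in the preceding abstract can be made concrete with a small expected-value sketch. Everything about the model here is an assumption made for illustration: sensors are grouped into slices at integer distances 1..n from the sink, every slice generates the same amount of data, a one-hop transmission costs 1 unit, and a direct transmission from slice i costs i squared. The closed-form balancing probabilities derived in the surveyed work are not reproduced, and the function name `expected_energy` is hypothetical.

```python
def expected_energy(n, p_direct, gen_rate=1.0):
    """Expected energy spent by each slice 1..n under the assumed cost model.

    p_direct[i] is the probability that slice i sends a message straight to
    the sink instead of handing it to slice i - 1 (index 0 is unused).
    """
    energy = [0.0] * (n + 1)
    inflow = 0.0  # traffic handed down from the slice one step farther out
    for i in range(n, 0, -1):
        traffic = gen_rate + inflow
        # For slice 1 a hop and a direct send coincide, so treat it as direct.
        p = 1.0 if i == 1 else p_direct[i]
        energy[i] = traffic * (p * i ** 2 + (1.0 - p) * 1.0)
        inflow = traffic * (1.0 - p)
    return energy[1:]


if __name__ == "__main__":
    n = 3
    print(expected_energy(n, [0, 0, 0, 0]))  # hop-by-hop only
    print(expected_energy(n, [0, 1, 1, 1]))  # direct only
```

Under these assumptions, pure hop-by-hop forwarding loads the slice next to the sink most heavily, while pure direct transmission drains the farthest slice. Intermediate probabilities are what an energy-balancing protocol would have to choose.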
Abstract: This paper studies the data gathering problem in wireless networks, where data generated at the nodes has to be collected at a single sink. We investigate the relationship between routing optimality and fair resource management. In particular, we prove that for energy balanced data propagation, Pareto optimal routing and flow maximization are equivalent, and also prove that flow maximization is equivalent to maximizing the network lifetime. We algebraically characterize the network structures in which energy balanced data flows are maximal. Moreover, we algebraically characterize communication links which are not used by an optimal flow. This leads to the characterization of minimal network structures supporting the maximal flows.
We note that energy balance, although implying global optimality, is a local property that can be computed efficiently and in a distributed manner. We suggest online distributed algorithms for energy balance in different optimal network structures and numerically show their stability in particular settings. We remark that although the results obtained in this paper have a direct consequence in energy saving for wireless networks, they are limited neither to this type of network nor to energy as a resource. As a matter of fact, the results are much more general and can be used for any type of network and different types of resources.
Abstract: This paper studies the data gathering problem in wireless networks, where data generated at the nodes has to be collected at a single sink. We investigate the relationship between routing optimality and fair resource management. In particular, we prove that for energy-balanced data propagation, Pareto optimal routing and flow maximization are equivalent, and also prove that flow maximization is equivalent to maximizing the network lifetime. We algebraically characterize the network structures in which energy-balanced data flows are maximal. Moreover, we algebraically characterize communication links which are not used by an optimal flow. This leads to the characterization of minimal network structures supporting the maximal flows.
We note that energy balance, although implying global optimality, is a local property that can be computed efficiently and in a distributed manner. We suggest online distributed algorithms for energy balance in different optimal network structures and numerically show their stability in particular settings. We remark that although the results obtained in this paper have a direct consequence in energy saving for wireless networks, they are limited neither to this type of network nor to energy as a resource. As a matter of fact, the results are much more general and can be used for any type of network and different types of resources.
Abstract: In this work we tackle the open problem of self-join size (SJS) estimation in a large-scale distributed data system, where tuples of a relation are distributed over data nodes which comprise an overlay network. Our contributions include adaptations of five well-known centralized SJS estimation techniques (coined sequential, cross-sampling, adaptive, bifocal, and sample-count) to the network environment and a novel technique which is based on the use of the Gini coefficient. We develop analyses showing how Gini estimations can lead to estimations of the underlying Zipfian or power-law value distributions. We further contribute distributed sampling algorithms that can estimate the Gini coefficient accurately and efficiently. Finally, we provide detailed experimental evidence attesting to the claimed increased accuracy, precision, and efficiency of the proposed SJS estimation method, compared to the other methods. The proposed approach is the only one to ensure high efficiency, precision, and accuracy regardless of the skew of the underlying data.
Abstract: In this work, we study protocols (i.e. distributed algorithms) so that populations of distributed processes can construct networks. In order to highlight the basic principles of distributed network construction we keep the model minimal in all respects. In particular, we assume finite-state processes that all begin from the same initial state and all execute the same protocol (i.e. the system is homogeneous). Moreover, we assume pairwise interactions between the processes that are scheduled by an adversary. The only constraint on the adversary scheduler is that it must be fair, intuitively meaning that it must assign to every reachable configuration of the system a non-zero probability to occur. In order to allow processes to construct networks, we let them activate and deactivate their pairwise connections. When two processes interact, the protocol takes as input the states of the processes and the state of their connection and updates all of them. In particular, in every interaction, the protocol may activate an inactive connection, deactivate an active one, or leave the state of a connection unchanged. Initially all connections are inactive and the goal is for the processes, after interacting and activating/deactivating connections for a while, to end up with a desired stable network (i.e. one that does not change any more). We give protocols (optimal in some cases) and lower bounds for several basic network construction problems such as spanning line, spanning ring, spanning star, and regular network. We provide proofs of correctness for all of our protocols and analyze the expected time to convergence of most of them under a uniform random scheduler that selects the next pair of interacting processes uniformly at random from all such pairs. Finally, we prove several universality results by presenting generic protocols that are capable of simulating a Turing Machine (TM) and exploiting it in order to construct a large class of networks.
Our universality protocols use a subset of the population (waste) in order to distributedly construct there a TM able to decide a graph class in some given space. Then, the protocols repeatedly construct in the rest of the population (useful space) a graph equiprobably drawn from all possible graphs. The TM works on this and accepts if the presented graph is in the class. We additionally show how to partition the population into k supernodes, each being a line of log k nodes, for the largest such k. This amount of local memory is sufficient for the supernodes to obtain unique names and exploit their names and their memory to realize nontrivial constructions. Delicate composition and reinitialization issues have to be solved for these general constructions to work.
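To make the model in the preceding abstract concrete, the sketch below simulates one simple two-state rule set that builds a spanning star under a uniform random scheduler. It is offered as an illustration of the interaction model (identical initial states, pairwise interactions, connections switched on and off), not as the exact protocol analyzed in the paper. The state names `c` (center) and `p` (peripheral) and the function names are hypothetical.

```python
import random
from itertools import combinations


def is_star(state, active):
    """True when exactly one center remains and the active edges are exactly its n - 1 spokes."""
    if state.count("c") != 1:
        return False
    center = state.index("c")
    return len(active) == len(state) - 1 and all(center in e for e in active)


def spanning_star(n, seed=0, max_steps=1_000_000):
    """Run the illustrative rules until the network is a stable spanning star."""
    rng = random.Random(seed)
    state = ["c"] * n  # homogeneous start: every process is a center
    active = set()  # active connections, each stored as frozenset({u, v})
    pairs = [frozenset(p) for p in combinations(range(n), 2)]
    for step in range(max_steps):
        if is_star(state, active):
            return state, active, step
        edge = rng.choice(pairs)  # uniform random scheduler
        u, v = tuple(edge)
        if state[u] == "c" and state[v] == "c" and edge not in active:
            state[v] = "p"  # two centers meet: one yields and they connect
            active.add(edge)
        elif state[u] == "p" and state[v] == "p" and edge in active:
            active.discard(edge)  # two peripherals drop a leftover connection
        elif state[u] != state[v] and edge not in active:
            active.add(edge)  # a center attaches a peripheral
    raise RuntimeError("no stable star within max_steps")


if __name__ == "__main__":
    state, active, steps = spanning_star(8, seed=1)
    print(state, len(active), steps)
```

Once a single center remains with all of its spokes active and no other connection active, none of the three rules applies any more, so the configuration is stable in the sense used in the abstract.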
Abstract: Efficient query processing in traditional database management systems relies on statistics on base data. For centralized systems, there is a rich body of research results on such statistics, from simple aggregates to more elaborate synopses such as sketches and histograms. For Internet-scale distributed systems, on the other hand, statistics management still poses major challenges. With the work in this paper we aim to endow peer-to-peer data management over structured overlays with the power associated with such statistical information, with emphasis on meeting the scalability challenge. To this end, we first contribute efficient, accurate, and decentralized algorithms that can compute key aggregates such as Count, CountDistinct, Sum, and Average. We show how to construct several types of histograms, such as simple Equi-Width, Average Shifted Equi-Width, and Equi-Depth histograms. We present a full-fledged open-source implementation of these tools for distributed statistical synopses, and report on a comprehensive experimental performance evaluation, evaluating our contributions in terms of efficiency, accuracy, and scalability.
Abstract: We consider the important problem of energy balanced data propagation in wireless sensor networks and we extend and generalize
previous works by allowing adaptive energy assignment. We consider the data gathering problem where data are generated by the sensors and
must be routed toward a unique sink. Sensors route data by either sending the data directly to the sink or in a multi-hop fashion by delivering
the data to a neighbouring sensor. Direct and neighbouring transmissions require different levels of energy consumption. Basically, the protocols balance the energy consumption among the sensors by computing the adequate ratios of direct and neighbouring transmissions. An abstract model of energy dissipation as a random walk is proposed, along with rigorous performance analysis techniques. Two efficient distributed algorithms are presented and analysed, by both rigorous means and simulation.
The first one is easy to implement and fast to execute. The protocol assumes that sensors know a-priori the rate of data they generate.
The sink collects and processes all this information in order to compute the relevant value of the protocol parameter. This value is transmitted
to the sensors which individually compute their optimal ratios of direct and neighbouring transmissions. The second protocol avoids the necessary a-priori knowledge of the data rate generated by sensors by inferring the relevant information from the observation of the data paths.
Furthermore, this algorithm is based on stochastic estimation methods and is adaptive to environmental changes.
Abstract: In emerging pervasive scenarios, data is collected by sensing devices in streams that occur at several distributed points of observation. The size of the data typically far exceeds the storage and computational capabilities of the tiny devices that have to collect and process them. A general and challenging task is to allow (some of) the nodes of a pervasive network to collectively perform monitoring of a neighbourhood of interest by issuing continuous aggregate queries on the streams observed in its vicinity. This class of algorithms is fully decentralized and diffusive in nature: collecting all the data at a few central nodes of the network is unfeasible in networks of low capability devices or in the presence of massive data sets. Two main problems arise in this scenario: (i) the intrinsic complexity of maintaining statistics over a data stream whose size greatly exceeds the capabilities of the device that performs the computation; (ii) composing the partial outcomes computed at different points of observation into an accurate, global statistic over a neighbourhood of interest, which entails coping with several problems, last but not least the receipt of duplicate information along multiple paths of diffusion.
Streaming techniques have emerged as powerful tools to achieve the general goals described above, in the first place because they assume a computational model in which the amount of data on which computation occurs far exceeds the available computational and storage resources. In this contribution, we review the main streaming techniques and provide a classification of the computational problems and the applications they effectively address, with an emphasis on decentralized scenarios, which are of particular interest in pervasive networks.
Abstract: We consider a security problem on a distributed network.
We assume a network whose nodes are vulnerable to infection
by threats (e.g. viruses), the attackers. A system security
software, the defender, is available in the system. However,
due to the network's size and to economic and performance reasons,
it is capable of providing safety, i.e. cleaning nodes of
the possible presence of attackers, only to a limited part of
it. The objective of the defender is to place itself in such a
way as to maximize the number of attackers caught, while
each attacker aims not to be caught.
In [7], a basic case of this problem was modeled as a
non-cooperative game, called the Edge model. There, the
defender could protect a single link of the network. Here,
we consider a more general case of the problem where the
defender is able to scan and protect a set of k links of the
network, which we call the Tuple model. It is natural to expect
that this increased power of the defender should result
in a better quality of protection for the network. Ideally,
this would be achieved at little expense on the existence and
complexity of Nash equilibria (profiles where no entity can
improve its local objective unilaterally by switching placements
on the network).
In this paper we study pure and mixed Nash equilibria
in the model. In particular, we propose algorithms for computing
such equilibria in polynomial time and we provide a
polynomial-time transformation of a special class of Nash
equilibria, called matching equilibria, between the Edge
model and the Tuple model, and vice versa. Finally, we
establish that the increased power of the defender results in
higher-quality protection of the network.
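To make the defender's objective concrete, here is a small sketch under
stated assumptions. It assumes that each attacker occupies one node and is
caught when that node is an endpoint of a defended link; the abstract does
not spell out this rule, and the function names attackers_caught and
best_tuple are hypothetical. The sketch only computes a brute-force best
response of the defender to a fixed attacker placement; it does not compute
Nash equilibria and is not the polynomial-time algorithm of the paper.

```python
from itertools import combinations


def attackers_caught(attacker_nodes, defended_edges):
    """Count attackers whose node is an endpoint of some defended edge (assumed rule)."""
    covered = {v for edge in defended_edges for v in edge}
    return sum(1 for v in attacker_nodes if v in covered)


def best_tuple(edges, attacker_nodes, k):
    """Brute-force search over all sets of k edges; exponential in k, for tiny graphs only."""
    return max(
        combinations(edges, k),
        key=lambda chosen: attackers_caught(attacker_nodes, chosen),
    )


# Illustrative instance: a path on five nodes with four attackers.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
attackers = [0, 0, 3, 4]

caught_k1 = attackers_caught(attackers, best_tuple(edges, attackers, 1))
caught_k2 = attackers_caught(attackers, best_tuple(edges, attackers, 2))
```

On this instance the defender catches two attackers with a single link
(k = 1, the Edge model) and all four with two links (k = 2), which matches
the intuition that a larger tuple gives the defender more power against a
fixed placement.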
Abstract: In this paper we describe a new simulation platform for complex wireless sensor networks that operate a collection of distributed algorithms and network protocols. Simulating such systems is complicated because of the need to coordinate different network layers and debug protocol stacks, often with very different interfaces, options, and fidelities. Our platform (which we call WSNGE) is a flexible and extensible environment that provides a highly scalable simulator with unique characteristics. It focuses on user friendliness, providing every function in both a scriptable and a visual way, allowing the researcher to define simulations and view results in an easy-to-use graphical environment. Unlike other solutions, WSNGE does not distinguish between different scenario types, allowing multiple different protocols to run at the same time. It enables rich online interaction with running simulations, allowing parameters, topologies or the whole scenario to be altered at any point in time.