Abstract: The efficient use of resources and the lossless transfer of data bursts in future optical
networks require accurate knowledge of the available bandwidth of each network link. Such
information is important for monitoring congestion and can be used by appropriate
load-balancing and congestion-avoidance mechanisms. In this paper we propose a mechanism
for monitoring and subsequently managing bandwidth resources, using the Simple Network
Management Protocol (SNMP). In the proposed mechanism, link bandwidth availability is not
a scalar parameter but a function of time that records the future utilization of the link.
For every output port, each agent node maintains a simple data structure in the form of a
table that records the utilization profile of the corresponding outgoing link. With the
addition of new objects to the Management Information Base (MIB) of each agent node and
proper synchronization, SNMP can be used to update and retrieve the reservations made on
the links, in order to obtain an instant picture of the network traffic situation.
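The per-port utilization profile described above can be sketched as a table mapping future time slots to reserved bandwidth. The sketch below is only an illustration of that data structure; the class and parameter names (slot granularity, capacity) are our assumptions, not objects from the paper's MIB.

```python
class UtilizationProfile:
    """Records future bandwidth reservations of one outgoing link."""

    def __init__(self, capacity_mbps, num_slots):
        self.capacity = capacity_mbps
        self.reserved = [0.0] * num_slots  # reserved bandwidth per future time slot

    def available(self, start_slot, end_slot):
        """Smallest free bandwidth over the interval [start_slot, end_slot)."""
        return min(self.capacity - self.reserved[s] for s in range(start_slot, end_slot))

    def reserve(self, start_slot, end_slot, mbps):
        """Reserve bandwidth for a burst only if the whole interval can carry it."""
        if self.available(start_slot, end_slot) < mbps:
            return False
        for s in range(start_slot, end_slot):
            self.reserved[s] += mbps
        return True

profile = UtilizationProfile(capacity_mbps=1000, num_slots=8)
assert profile.reserve(2, 5, 400)
assert profile.reserve(2, 5, 400)
assert not profile.reserve(3, 4, 400)  # slot 3 already carries 800 of 1000
```

Retrieving the whole `reserved` table via SNMP would give a manager the "function of time" view of link availability, rather than a single scalar.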
Abstract: In this work we propose a new unified PON RAN architecture for LTE mobile backhaul
networks, employing ring-based WDM PONs. The proposed architecture supports dynamic setup
of virtual circuits for inter-base-station communication over a dedicated λ-LAN channel.
The reservation mechanism is arbitrated by the OLT, which also monitors the traffic
imbalances of the downstream channels. The proposed architecture also supports load
balancing, by dynamically reallocating and sharing the capacity of the downstream wavelengths.
Abstract: Load balancing/sharing is a policy that exploits the communication facility between the servers of a distributed system: by exchanging status information and jobs between any two servers of the system, it aims to improve the performance of the whole system. In this work, we propose a new adaptive distributed hierarchical scheme, the Virtual Tree Algorithm (VTA), which creates a virtual binary tree structure over the actual network topology. It uses the Difference-Initiated (DI) technique ([11, 1]) for load balancing/sharing, which needs remote information for the transfer policy and no additional information for the location policy. We demonstrate that the introduced virtual construction keeps the number of exchanged messages comparable to that of previously known efficient algorithms. To support this claim and evaluate the performance of our policy, we use both analytical and simulation results. Using the simulation model that we developed, we compared our results with one of the most representative new adaptive, symmetrical, distributed, and efficient algorithms, the Variable Threshold (VTHR) algorithm.
Abstract: We consider selfish routing over a network consisting of m parallel links through which n selfish users route their traffic, trying to minimize their own expected latency. We study the class of mixed strategies in which the expected latency through each link is at most a constant multiple of the optimum maximum latency had global regulation been available. For the case of uniform links it is known that all Nash equilibria belong to this class of strategies. We are interested in bounding the coordination ratio (or price of anarchy) of these strategies, defined as the worst-case ratio of the maximum (over all links) expected latency over the optimum maximum latency. The load-balancing aspect of the problem immediately implies a lower bound Ω(ln m / ln ln m) on the coordination ratio. We give a tight (up to a multiplicative constant) upper bound. To show the upper bound, we analyze a variant of the classical balls-and-bins problem, in which balls with arbitrary weights are placed into bins according to arbitrary probability distributions. At the heart of our approach is a new probabilistic tool that we call ball fusion; this tool is used to reduce the variant of the problem where balls bear weights to the classical version (with no weights). Ball fusion applies to more general settings, such as links with arbitrary capacities and other latency functions.
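The weighted balls-and-bins view can be made concrete with a small simulation, shown here purely as an illustration (the setup and parameters are our assumptions, not the paper's analysis): n jobs with arbitrary weights are placed into m bins according to a probability distribution, and we observe the maximum bin load against the optimum sum(weights)/m that global regulation could achieve.

```python
import random

def max_load(weights, m, rng):
    """Place each weighted ball into a uniformly random bin; return the max load."""
    loads = [0.0] * m
    for w in weights:
        loads[rng.randrange(m)] += w
    return max(loads)

rng = random.Random(0)
weights = [1.0] * 64          # unit weights: the classical (unweighted) case
m = 64
opt = sum(weights) / m        # optimum maximum latency under global regulation
ratio = max_load(weights, m, rng) / opt
# For unit balls placed uniformly, theory predicts this ratio concentrates
# around ln m / ln ln m, matching the lower bound quoted in the abstract.
assert ratio >= 1.0
```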
Abstract: We study the problem of scheduling permanent jobs on unrelated machines when the
objective is to minimize the Lp norm of the machine loads. The problem is known as load
balancing under the Lp norm. We present an improved upper bound for the greedy algorithm
through simple analysis; this bound is also shown to be best possible within the class of
deterministic online algorithms for the problem. We also address the question of whether
randomization helps online load balancing under Lp norms on unrelated machines; this is a
challenging question that has been open for more than a decade, even for the L2 norm. We
provide a positive answer to this question by presenting the first randomized online
algorithms which outperform deterministic ones under any (integral) Lp norm for
p = 2, …, 137. Our algorithms essentially compute in an online manner a fractional
solution to the problem and use the fractional values to make random choices. The local
optimization criterion used at each step is novel and rather counterintuitive: the values
of the fractional variables for each job correspond to flows at an approximate Wardrop
equilibrium for an appropriately defined non-atomic congestion game. As corollaries of
our analysis, and by exploiting the relation between the Lp norm and the makespan of
machine loads, we obtain new competitive algorithms for online makespan minimization,
making progress in another longstanding open problem.
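The deterministic greedy algorithm analysed above can be sketched as follows; this is a minimal illustration of the greedy rule (assign each arriving job to the machine that minimises the resulting Lp norm), with processing times `p[j][i]` of job j on machine i modelling the unrelated-machines setting. Names are ours, not the paper's.

```python
def lp_norm(loads, p):
    """Lp norm of the vector of machine loads."""
    return sum(x ** p for x in loads) ** (1.0 / p)

def greedy_lp(jobs, num_machines, p):
    """jobs: list of per-machine processing-time lists (unrelated machines)."""
    loads = [0.0] * num_machines
    assignment = []
    for times in jobs:
        # assign the job to the machine whose choice yields the smallest Lp norm
        best = min(range(num_machines),
                   key=lambda i: lp_norm(
                       [loads[k] + (times[k] if k == i else 0.0)
                        for k in range(num_machines)], p))
        loads[best] += times[best]
        assignment.append(best)
    return assignment, loads

# Two "specialist" jobs go to their fast machines; the third breaks the tie.
assignment, loads = greedy_lp([[1, 10], [10, 1], [1, 1]], num_machines=2, p=2)
assert assignment == [0, 1, 0]
```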
Abstract: Counting in general, and estimating the cardinality of (multi-)sets in particular, is highly desirable for a large variety of applications, representing a foundational block for the efficient deployment and access of emerging internet-scale information systems. Examples of such applications range from optimizing query access plans in internet-scale databases, to evaluating the significance (rank/score) of various data items in information retrieval applications. The key constraints that any acceptable solution must satisfy are: (i) efficiency: the number of nodes that need be contacted for counting purposes must be small in order to enjoy small latency and bandwidth requirements; (ii) scalability, seemingly contradicting the efficiency goal: arbitrarily large numbers of nodes may need to add elements to a (multi-)set, which dictates the need for a highly distributed solution, avoiding server-based scalability, bottleneck, and availability problems; (iii) access and storage load balancing: counting and related overhead chores should be distributed fairly to the nodes of the network; (iv) accuracy: tunable, robust (in the presence of dynamics and failures), and highly accurate cardinality estimation; (v) simplicity and ease of integration: special, solution-specific indexing structures should be avoided. In this paper, first we contribute a highly distributed, scalable, efficient, and accurate (multi-)set cardinality estimator. Subsequently, we show how to use our solution to build and maintain histograms, which have been a basic building block for query optimization in centralized databases, facilitating their porting into the realm of internet-scale data networks.
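For readers unfamiliar with hash-based cardinality estimation, a Flajolet-Martin-style sketch conveys the core idea; this is only a minimal illustration of probabilistic (multi-)set counting, not the paper's distributed estimator.

```python
import hashlib

def rho(x):
    """Position (1-based) of the least-significant 1-bit of x."""
    return (x & -x).bit_length()

def fm_estimate(items):
    """Estimate the number of distinct items from the max rho over hashed items."""
    max_rho = 0
    for item in items:
        h = int(hashlib.sha1(str(item).encode()).hexdigest(), 16)
        max_rho = max(max_rho, rho(h))
    return 2 ** max_rho / 0.77351  # standard Flajolet-Martin bias correction

# Duplicates do not change the estimate: the sketch counts distinct elements,
# which is why such estimators suit multi-sets fed by many independent nodes.
assert fm_estimate([1, 2, 3] * 100) == fm_estimate([1, 2, 3])
```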
Abstract: We present a new overlay, called the Deterministic Decentralized tree (D2-tree).
The D2-tree compares favourably to other overlays for the following reasons: (a) it
provides matching and better complexities, which are deterministic for the supported
operations; (b) the management of nodes (peers) and elements are completely decoupled
from each other; and (c) an efficient deterministic load-balancing mechanism is presented
for the uniform distribution of elements into nodes, while at the same time probabilistic
optimal bounds are provided for the congestion of operations at the nodes.
Abstract: We address the issue of measuring distribution fairness in Internet-scale networks. This problem has several interesting instances encountered in different applications, ranging from assessing the distribution of load between network nodes for load-balancing purposes, to measuring node utilization for optimal resource exploitation, and to guiding autonomous decisions of nodes in networks built with market-based economic principles. Although some metrics have been proposed, particularly for assessing load-balancing algorithms, they fall short. We first study the appropriateness of various known and previously proposed statistical metrics for measuring distribution fairness. We put forward a number of required characteristics for appropriate metrics. We propose and comparatively study the appropriateness of the Gini coefficient (G) for this task. Our study reveals as most appropriate the metrics of G, the fairness index (FI), and the coefficient of variation (CV), in this order. Second, we develop six distributed sampling algorithms to estimate these metrics online efficiently, accurately, and scalably. One of these algorithms (2-PRWS) is based on two effective optimizations of a basic algorithm, and two others (the sequential sampling algorithm, LBS-HL, and the clustered sampling one, EBSS) are novel, developed especially to estimate G. Third, we show how these metrics, and especially G, can be readily utilized online by higher-level algorithms, which can then know when best to intervene to correct unfair distributions (in particular, load imbalances). We conclude with a comprehensive experimentation which comparatively evaluates both the various proposed estimation algorithms and the three most appropriate metrics (G, CV, and FI). Specifically, the evaluation quantifies the efficiency (in terms of number of messages and a latency indicator), precision, and accuracy achieved by the proposed algorithms when estimating the competing fairness metrics.
The central conclusion is that the proposed metric, G, can be estimated with a small number of messages and low latency, regardless of the skew of the underlying distribution.
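The Gini coefficient G put forward above has a simple closed form over the sorted load values; the sketch below shows only this centralized definition (the paper's sampling algorithms estimate G without global knowledge).

```python
def gini(values):
    """Gini coefficient: mean absolute pairwise difference over twice the mean.

    0 means a perfectly fair (equal) distribution; values near 1 mean that a
    few nodes carry almost all of the load.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if total == 0:
        return 0.0
    # Equivalent closed form over the sorted values.
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2.0 * weighted) / (n * total) - (n + 1.0) / n

assert gini([1, 1, 1, 1]) == 0.0               # perfectly fair load
assert abs(gini([0, 0, 0, 4]) - 0.75) < 1e-9   # one node carries all load
```

A load balancer polling such a metric can trigger rebalancing when G crosses a chosen threshold, which is the kind of online use the abstract describes.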
Abstract: We present a new overlay, called the Deterministic Decentralized tree (D2-tree).
The D2-tree compares favorably to other overlays for the following reasons: (a) it
provides matching and better complexities, which are deterministic for the supported
operations; (b) the management of nodes (peers) and elements are completely decoupled
from each other; and (c) an efficient deterministic load-balancing mechanism is presented
for the uniform distribution of elements into nodes, while at the same time probabilistic
optimal bounds are provided for the congestion of operations at the nodes. The
load-balancing scheme of elements into nodes is deterministic and general enough to be
applied to other hierarchical tree-based overlays. This load-balancing mechanism is based
on an innovative lazy weight-balancing mechanism, which is interesting in its own right.
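The lazy weight-balancing idea can be sketched in the abstract: each subtree tracks its element count (weight), and elements are redistributed only when the weights of two sibling subtrees drift out of balance beyond a fixed factor. The balance parameter and trigger condition below are illustrative assumptions, not the D2-tree's actual scheme.

```python
ALPHA = 0.25  # assumed balance parameter: each sibling keeps >= 25% of the weight

def needs_redistribution(left_weight, right_weight):
    """Lazy trigger: redistribute only when one sibling falls below ALPHA of the total."""
    total = left_weight + right_weight
    if total == 0:
        return False
    return min(left_weight, right_weight) < ALPHA * total

assert not needs_redistribution(40, 60)  # mild skew tolerated (laziness)
assert needs_redistribution(10, 90)      # strong skew triggers rebalancing
```

The "lazy" aspect is that mildly skewed subtrees are tolerated, so redistribution cost is paid rarely and can be amortized against the insertions that caused the skew.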
Abstract: We investigate the impact of multiple mobile sinks on efficient data collection
in wireless sensor networks. To improve performance, our protocol design focuses on
minimizing overlaps of sink trajectories and balancing the service load among the sinks.
To cope with high network dynamics, placement irregularities, and limited network
knowledge, we propose three different protocols: a) a centralized one that explicitly
equalizes spatial coverage; this protocol relies on strong modeling assumptions and also
serves as a kind of performance lower bound in uniform networks of low dynamics; b) a
distributed protocol based on mutual avoidance of sinks; and c) a clustering protocol
that distributively groups service areas towards balancing the load per sink. Our
simulation findings demonstrate significant gains in latency, while keeping the success
rate and the energy dissipation at very satisfactory levels, even under high network
dynamics and deployment heterogeneity.
Abstract: In load-balancing games, there is a set of available servers and a set of
clients; each client wishes to run her job on some server. Clients are selfish, and each
of them selects a server that, given an assignment of the other clients to servers,
minimizes the latency she experiences, with no regard to the global optimum. In order to
mitigate the effect of selfishness on efficiency, we assign taxes to the servers. In this
way, we obtain a new game where each client aims to minimize the sum of the latency she
experiences and the tax she pays. Our objective is to find taxes so that the worst
equilibrium of the new game is as efficient as possible. We present new results
concerning the impact of taxes on the efficiency of equilibria, with respect to the total
latency of all clients and the maximum latency (makespan).
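The taxed game can be illustrated with best-response dynamics; the latency model (linear latency equal to server load, identical unit-weight clients) and the convergence loop below are our simplifying assumptions, not the paper's construction.

```python
def best_response_dynamics(num_clients, num_servers, taxes, max_rounds=100):
    """Clients repeatedly best-respond to cost = experienced latency + server tax."""
    assign = [0] * num_clients          # all clients start on server 0
    for _ in range(max_rounds):
        changed = False
        for c in range(num_clients):
            loads = [0] * num_servers
            for other, s in enumerate(assign):
                if other != c:
                    loads[s] += 1
            # perceived cost on server s: latency (load including self) plus tax
            best = min(range(num_servers),
                       key=lambda s: (loads[s] + 1) + taxes[s])
            if best != assign[c]:
                assign[c] = best
                changed = True
        if not changed:                  # no client wants to deviate: equilibrium
            return assign
    return assign

# Without taxes, 4 identical clients split evenly over 2 identical servers.
assert best_response_dynamics(4, 2, [0, 0]).count(0) == 2
```

Choosing the tax vector shifts which equilibria survive, which is exactly the design freedom the abstract studies.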
Abstract: With this work we aim to make a three-fold contribution. We first address the
issue of efficiently supporting queries over string attributes involving prefix, suffix,
containment, and equality operators in large-scale data networks. Our first design
decision is to employ distributed hash tables (DHTs) for the data network's topology,
harnessing their desirable properties. Our next design decision is to derive
DHT-independent solutions, treating the DHT as a black box. Second, we exploit this
infrastructure to develop efficient content-based publish/subscribe systems. The main
contributions here are algorithms for the efficient processing of queries (subscriptions)
and events (publications). Specifically, we show that our subscription processing
algorithms require O(log N) messages for an N-node network, and our event processing
algorithms require O(l · log N) messages (with l being the average string length).
Third, we develop algorithms for optimizing the processing of multi-dimensional events,
involving several string attributes. Further to our analysis, we provide simulation-based
experiments showing promising performance results in terms of number of messages,
required bandwidth, load balancing, and response times.
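A toy model conveys why event processing needs on the order of l DHT lookups for a string of length l: a prefix subscription is stored under the hash of its pattern (one O(log N)-message lookup in a real DHT), and an incoming event is looked up once per prefix it contains. The scheme below is our illustration of this counting argument, not the paper's exact algorithms.

```python
import hashlib

def node_for(key, num_nodes):
    """Map a string key to a DHT node (stand-in for an O(log N) DHT lookup)."""
    return int(hashlib.sha1(key.encode()).hexdigest(), 16) % num_nodes

class ToyDHT:
    def __init__(self, num_nodes):
        self.num_nodes = num_nodes
        self.store = [dict() for _ in range(num_nodes)]  # per-node storage

    def subscribe_prefix(self, pattern, subscriber):
        # One lookup to place the subscription at the node responsible for it.
        node = self.store[node_for(pattern, self.num_nodes)]
        node.setdefault(pattern, []).append(subscriber)

    def publish(self, event):
        # l lookups: one per prefix of the event string.
        matched = []
        for i in range(1, len(event) + 1):
            prefix = event[:i]
            matched += self.store[node_for(prefix, self.num_nodes)].get(prefix, [])
        return matched

dht = ToyDHT(num_nodes=16)
dht.subscribe_prefix("foo", "alice")
assert dht.publish("foobar") == ["alice"]
assert dht.publish("barfoo") == []
```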
Abstract: The promises inherent in users coming together to form data-sharing network
communities bring to the foreground new problems, formulated over such dynamic,
ever-growing computing, storage, and networking infrastructures. A key open challenge is
to harness these highly distributed resources toward the development of an
ultra-scalable, efficient search engine. From a technical viewpoint, any acceptable
solution must fully exploit all available resources, dictating the removal of any
centralized points of control, which can also readily lead to performance bottlenecks and
reliability/availability problems. Equally importantly, however, a highly distributed
solution can also facilitate pluralism in informing users about internet content, which
is crucial in order to preclude the formation of information-resource monopolies and the
biased visibility of content from economically powerful sources. To meet these
challenges, the work described here puts forward MINERVA∞, a novel search engine
architecture designed for scalability and efficiency. MINERVA∞ encompasses a suite of
novel algorithms, including algorithms for creating data networks of interest, placing
data on network nodes, load balancing, top-k algorithms for retrieving data at query
time, and replication algorithms for expediting top-k query processing. We have
implemented the proposed architecture and we report on our extensive experiments with
real-world, web-crawled, and synthetic data and queries, showcasing the scalability and
efficiency traits of MINERVA∞.
Abstract: Recent rapid technological developments have led to the development of tiny,
low-power, low-cost sensors. Such devices integrate sensing, limited data processing, and
communication capabilities. The effective distributed collaboration of large numbers of
such devices can lead to the efficient accomplishment of large sensing tasks.
This talk focuses on several aspects of energy efficiency. Two protocols for data
propagation are studied: the first creates probabilistically optimized redundant data
transmissions to combine energy efficiency with fault tolerance, while the second
guarantees (in a probabilistic way) the same per-sensor energy dissipation, towards
balancing the energy load and prolonging the lifetime of the network. A third protocol
(in fact a power-saving scheme) is also presented, which directly and adaptively affects
power dissipation at each sensor. This “lower level” scheme can be combined with data
propagation protocols to further improve energy efficiency.
Abstract: In this work we focus on the energy efficiency challenge in wireless sensor networks, from both an on-line perspective (related to routing) and a network design perspective (related to tracking). We investigate a few representative, important aspects of energy efficiency: a) robust and fast data propagation; b) the problem of balancing the energy
dissipation among all sensors in the network; and c) the problem of efficiently tracking moving
entities in sensor networks. Our work here is a methodological survey of selected results that
have already appeared in the related literature.
In particular, we investigate important issues of energy optimization, like minimizing the total
energy dissipation, minimizing the number of transmissions, as well as balancing the energy
load to prolong the system's lifetime. We review characteristic protocols and techniques in the recent literature, including probabilistic forwarding and local optimization methods. We study the problem of localizing and tracking multiple moving targets from a network design perspective, i.e., towards estimating the least possible number of sensors, their positions, and operation characteristics needed to efficiently perform the tracking task. To avoid an expensive massive deployment, we try to take advantage of possible coverage overlaps over space and time, by introducing a novel combinatorial model that captures such overlaps. Under this model, we abstract the tracking network design problem by a covering combinatorial problem and then design and analyze an efficient approximate method for sensor placement
and operation.
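The energy-balance trade-off behind probabilistic forwarding can be shown with a simplified model (our illustration, not the surveyed protocols themselves): a node in slice i of the network either relays a message one hop inward (energy cost 1) or transmits directly to the sink (cost roughly i² under the usual distance-squared energy model), with a per-slice probability that can be tuned to even out the expected energy spent per slice.

```python
import random

def simulate(num_slices, msgs_per_slice, p_hop, rng):
    """Route msgs_per_slice messages from every slice to the sink (slice 0).

    p_hop[i] is the probability that a node in slice i relays one hop inward;
    otherwise it transmits directly to the sink. Returns energy spent per slice.
    """
    energy = [0.0] * (num_slices + 1)
    for origin in range(1, num_slices + 1):
        for _ in range(msgs_per_slice):
            i = origin
            while i > 0:
                if rng.random() < p_hop[i]:
                    energy[i] += 1.0           # relay one hop to slice i-1
                    i -= 1
                else:
                    energy[i] += float(i * i)  # direct transmission to the sink
                    i = 0
    return energy[1:]

rng = random.Random(42)
# Always hopping concentrates relay work on the slices near the sink,
# which is exactly the imbalance the surveyed protocols try to correct.
hop_only = simulate(4, 100, [0, 1.0, 1.0, 1.0, 1.0], rng)
assert hop_only[0] > hop_only[-1]
```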
Abstract: The content-based publish/subscribe (pub/sub) paradigm for system design is becoming increasingly popular, offering unique benefits for a large number of data-intensive applications. Coupled with peer-to-peer technology, it can serve as a central building block for such applications
deployed over a large-scale network infrastructure. A key problem toward the creation of large-scale content-based pub/sub infrastructures relates to dealing efficiently with continuous queries (subscriptions) with rich predicates on string attributes; in particular, efficiently and accurately
matching substring queries to incoming events is an open problem. In this work we study this problem. We provide and analyze novel algorithms for processing subscriptions with substring predicates and events in a variety of environments. We provide experimental data demonstrating the
relative performance behavior of the proposed algorithms using as key metrics the network bandwidth requirements, number of messages, load balancing, as well as requirements for extra routing state (and related maintenance) and design flexibility.
Abstract: We consider the conflicting problems of ensuring data-access load balancing and efficiently processing range queries on peer-to-peer data networks maintained over Distributed Hash Tables (DHTs). Placing consecutive data values in neighboring peers is frequently used in DHTs, since it accelerates range query processing. However, such a placement is highly susceptible to load imbalances, which are preferably handled by replicating data (since replication also introduces fault-tolerance benefits). In this paper, we present HotRoD, a DHT-based architecture that deals effectively with this combined problem through the use of a novel locality-preserving hash function and a tunable data replication mechanism which allows trading off replication costs for fair load distribution. Our detailed experimentation study shows strong gains in both range query processing efficiency and data-access load balancing, with low replication overhead. To our knowledge, this is the first work that concurrently addresses the two conflicting problems using data replication.
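The two ingredients named above can be sketched as follows; this is our illustration of the general idea (an order-preserving hash placing consecutive values on neighbouring arcs of the identifier ring, and replica identifiers rotated to other arcs to spread access load), with ring size, value range, and replica placement all being assumed parameters rather than HotRoD's actual design.

```python
RING = 2 ** 16  # identifier space size (assumed)

def locality_hash(value, lo, hi):
    """Order-preserving: maps the value domain [lo, hi] linearly onto the ring."""
    return int((value - lo) / (hi - lo) * (RING - 1))

def replica_id(base_id, replica_no, num_replicas):
    """Rotate the k-th replica of a hot range to a different arc of the ring."""
    return (base_id + replica_no * RING // num_replicas) % RING

# Consecutive values land on nearby identifiers, so a range query walks
# neighbouring peers instead of scattering across the network:
assert locality_hash(10, 0, 100) < locality_hash(11, 0, 100) < locality_hash(50, 0, 100)
```

The tension the abstract describes is visible here: the very locality that speeds up range queries concentrates popular ("hot") ranges on few peers, which the rotated replicas then relieve.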
Abstract: In this paper we present an efficient general simulation strategy for
computations designed for fully operational BSP machines of n ideal processors, on
n-processor dynamic-fault-prone BSP machines. The fault occurrences are fail-stop and
fully dynamic, i.e., they are allowed to happen on-line at any point of the computation,
subject to the constraint that the total number of faulty processors may never exceed a
known fraction. The computational paradigm can be exploited for robust computations over
virtual parallel settings with a volatile underlying infrastructure, such as a network of
workstations (where workstations may be taken out of the virtual parallel machine by
their owners).
Our simulation strategy is Las Vegas (i.e., it may never fail, due to backtracking
operations to robustly stored instances of the computation in case of locally
unrecoverable situations). It adopts an adaptive balancing scheme of the workload among
the currently live processors of the BSP machine.
Our strategy is efficient in the sense that, compared with an optimal off-line
adversarial computation under the same sequence of fault occurrences, it achieves an
O((log n · log log n)²) multiplicative factor times the optimal work (namely, this
measure is in the sense of the “competitive ratio” of on-line analysis). In addition, our
scheme is modular, integrated, and considers many implementation points.
We comment that, to our knowledge, no previous work on robust parallel computations has
considered fully dynamic faults in the BSP model, or in general distributed-memory
systems. Furthermore, this is the first time an efficient Las Vegas simulation in this
area is achieved.
Abstract: We study the load balancing problem in the context of a set of clients, each wishing to run a job on a server selected among a subset of permissible servers for the particular client. We consider two different scenarios. In selfish load balancing, each client is selfish in the sense that it selects to run its job on the server, among its permissible servers, having the smallest latency given the assignments of the jobs of other clients to servers. In online load balancing, clients appear online and, when a client appears, it has to make an irrevocable decision and assign its job to one of its permissible servers. Here, we assume that the clients aim to optimize some global criterion, but in an online fashion. A natural local optimization criterion that can be used by each client when making its decision is to assign its job to the server that gives the minimum increase of the global objective. This gives rise to greedy online solutions. The aim of this paper is to determine how much the quality of load balancing is affected by selfishness and greediness.
We characterize almost completely the impact of selfishness and greediness in load balancing by presenting new and improved, tight or almost tight bounds on the price of anarchy and price of stability of selfish load balancing, as well as on the competitiveness of the greedy algorithm for online load balancing, when the objective is to minimize the total latency of all clients on servers with linear latency functions.
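The greedy online rule discussed above can be sketched concretely for the setting the bounds refer to: unit-weight jobs on servers whose latency equals their load, so total latency is sum(load²) and adding one job to a server of load L increases it by 2L + 1. The sketch and its names are our illustration, not the paper's formal model.

```python
def greedy_online(clients, num_servers):
    """clients: list of lists of permissible server indices (unit-weight jobs).

    Each arriving client is assigned to the permissible server that gives the
    minimum increase of the total latency: 2 * load + 1 on linear-latency servers.
    """
    loads = [0] * num_servers
    assignment = []
    for permissible in clients:
        best = min(permissible, key=lambda s: 2 * loads[s] + 1)
        loads[best] += 1
        assignment.append(best)
    return assignment, loads

# Two flexible clients spread out; two restricted clients fill in the gaps.
assignment, loads = greedy_online([[0, 1], [0, 1], [0], [1]], 2)
assert loads == [2, 2]
```

A selfish client in the same setting would instead minimize its own latency (load + 1 on the chosen server); comparing the two rules is exactly the selfishness-versus-greediness question the paper studies.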
Abstract: We study the load balancing problem in the context of a set of clients, each
wishing to run a job on a server selected among a subset of permissible servers for the
particular client. We consider two different scenarios. In selfish load balancing, each
client is selfish in the sense that it chooses, among its permissible servers, to run its
job on the server having the smallest latency, given the assignments of the jobs of other
clients to servers. In online load balancing, clients appear online and, when a client
appears, it has to make an irrevocable decision and assign its job to one of its
permissible servers. Here, we assume that the clients aim to optimize some global
criterion, but in an online fashion. A natural local optimization criterion that can be
used by each client when making its decision is to assign its job to the server that
gives the minimum increase of the global objective. This gives rise to greedy online
solutions. The aim of this paper is to determine how much the quality of load balancing
is affected by selfishness and greediness.
We characterize almost completely the impact of selfishness and greediness in load
balancing by presenting new and improved, tight or almost tight bounds on the price of
anarchy of selfish load balancing, as well as on the competitiveness of the greedy
algorithm for online load balancing, when the objective is to minimize the total latency
of all clients on servers with linear latency functions. In addition, we prove a tight
upper bound on the price of stability of linear congestion games.