Abstract: We present a simple parallel algorithm for the single-source shortest path
problem in planar digraphs with nonnegative real edge weights. The algorithm runs
on the EREW PRAM model of parallelcomputation in O((n2=+n1&=) log n)
time, performing O(n1+= log n) work for any 0<{\aa}<1/2. The strength of the
algorithm is its simplicity, making it easy to implement and presumable quite
efficient in practice. The algorithm improves upon the work of all previous
parallel algorithms. Our algorithm is based on a region decomposition of the
input graph and uses a well-known parallel implementation of Dijkstra's
algorithm. The logarithmic factor in both the work and the time can be
eliminated by plugging in a less practical, sequential planar shortest path
algorithm together with an improved parallel implementation of Dijkstra's
algorithm.

Abstract: Andrews et al. [Automatic method for hiding latency in high bandwidth networks, in: Proceedings of the ACM Symposium on Theory of Computing, 1996, pp. 257–265; Improved methods for hiding latency in high bandwidth networks, in: Proceedings of the Eighth Annual ACM Symposium on Parallel Algorithms and Architectures, 1996, pp. 52–61] introduced a number of techniques for automatically hiding latency when performing simulations of networks with unit delay links on networks with arbitrary unequal delay links. In their work, they assume that processors of the host network are identical in computational power to those of the guest network being simulated. They further assume that the links of the host are able to pipeline messages, i.e., they are able to deliver P packets in time O(P+d) where d is the delay on the link.
In this paper we examine the effect of eliminating one or both of these assumptions. In particular, we provide an efficient simulation of a linear array of homogeneous processors connected by unit-delay links on a linear array of heterogeneous processors connected by links with arbitrary delay. We show that the slowdown achieved by our simulation is optimal. We then consider the case of simulating cliques by cliques; i.e., a clique of heterogeneous processors with arbitrary delay links is used to simulate a clique of homogeneous processors with unit delay links. We reduce the slowdown from the obvious bound of the maximum delay link to the average of the link delays. In the case of the linear array we consider both links with and without pipelining. For the clique simulation the links are not assumed to support pipelining.
The main motivation of our results (as was the case with Andrews et al.) is to mitigate the degradation of performance when executing parallel programs designed for different architectures on a network of workstations (NOW). In such a setting it is unlikely that the links provided by the NOW will support pipelining and it is quite probable the processors will be heterogeneous. Combining our result on clique simulation with well-known techniques for simulating shared memory PRAMs on distributed memory machines provides an effective automatic compilation of a PRAM algorithm on a NOW.

Abstract: We study the combinatorial structure and computational complexity of extreme Nash equilibria, ones that maximize or minimize a certain objective function, in the context of a selfish routing game. Specifically, we assume a collection of n users, each employing a mixed strategy, which is a probability distribution over m parallel links, to control the routing of its own assigned traffic. In a Nash equilibrium, each user routes its traffic on links that minimize its expected latency cost.
Our structural results provide substantial evidence for the Fully Mixed Nash Equilibrium Conjecture, which states that the worst Nash equilibrium is the fully mixed Nash equilibrium, where each user chooses each link with positive probability. Specifically, we prove that the Fully Mixed Nash Equilibrium Conjecture is valid for pure Nash equilibria and that under a certain condition, the social cost of any Nash equilibrium is within a factor of 6 + epsi, of that of the fully mixed Nash equilibrium, assuming that link capacities are identical.
Our complexity results include hardness, approximability and inapproximability ones. Here we show, that for identical link capacities and under a certain condition, there is a randomized, polynomial-time algorithm to approximate the worst social cost within a factor arbitrarily close to 6 + epsi. Furthermore, we prove that for any arbitrary integer k > 0, it is -hard to decide whether or not any given allocation of users to links can be transformed into a pure Nash equilibrium using at most k selfish steps. Assuming identical link capacities, we give a polynomial-time approximation scheme (PTAS) to approximate the best social cost over all pure Nash equilibria. Finally we prove, that it is -hard to approximate the worst social cost within a multiplicative factor . The quantity is the tight upper bound on the ratio of the worst social cost and the optimal cost in the model of identical capacities.

Abstract: The simplex method has been successfully used in solving linear programming problems for many years. Parallel approaches have also extensively been studied due to the intensive computations required, especially for the solution of large linear problems (LPs). In this paper we present a highly scalable parallel implementation framework of the standard full tableau simplex method on a highly parallel (distributed memory) environment. Speciﬁcally, we have designed and implemented a suitable column distribution scheme as well as a row distribution scheme and we have entirely tested our implementations over a considerably powerful distributed platform (linux cluster with myrinet interface). We then compare our approaches (a) among each other for variable number of problem size (number of rows and columns) and (b) to other recent and valuable corresponding eﬀorts in the literature. In most cases, the column distribution scheme performs quite/much better than the row distribution scheme. Moreover, both schemes (even the row distribution scheme over large-scale problems) lead to particularly high speedup and eﬃciency values, which are considerably better in all cases than the ones achieved in other similar research eﬀorts and implementations. Moreover, we further evaluate our basic parallelization scheme over very large LPs in order to validate more reliably the high eﬃciency and scalability achieved.

Abstract: In routing games, the network performance at equilibrium can be significantly improved if we remove some edges from the network. This counterintuitive fact, widely known as Braess's paradox, gives rise to the (selfish) network design problem, where we seek to recognize routing games suffering from the paradox, and to improve the equilibrium performance by edge removal. In this work, we investigate the computational complexity and the approximability of the network design problem for non-atomic bottleneck routing games, where the individual cost of each player is the bottleneck cost of her path, and the social cost is the bottleneck cost of the network. We first show that bottleneck routing games do not suffer from Braess's paradox either if the network is series-parallel, or if we consider only subpath-optimal Nash flows. On the negative side, we prove that even for games with strictly increasing linear latencies, it is NP-hard not only to recognize instances suffering from the paradox, but also to distinguish between instances for which the Price of Anarchy (PoA) can decrease to 1 and instances for which the PoA is as large as \Omega(n^{0.121}) and cannot improve by edge removal. Thus, the network design problem for such games is NP-hard to approximate within a factor of O(n^{0.121-\eps}), for any constant \eps > 0. On the positive side, we show how to compute an almost optimal subnetwork w.r.t. the bottleneck cost of its worst Nash flow, when the worst Nash flow in the best subnetwork routes a non-negligible amount of flow on all used edges. The running time is determined by the total number of paths, and is quasipolynomial when the number of paths is quasipolynomial.

Abstract: In this paper we present an efficient general simulation strategy for
computations designed for fully operational BSP machines of n ideal processors,
on n-processor dynamic-fault-prone BSP machines. The fault occurrences are failstop
and fully dynamic, i.e., they are allowed to happen on-line at any point of the
computation, subject to the constraint that the total number of faulty processors
may never exceed a known fraction. The computational paradigm can be exploited
for robust computations over virtual parallel settings with a volatile underlying
infrastructure, such as a NETWORK OF WORKSTATIONS (where workstations may be
taken out of the virtual parallel machine by their owner).
Our simulation strategy is Las Vegas (i.e., it may never fail, due to backtracking
operations to robustly stored instances of the computation, in case of locally
unrecoverable situations). It adopts an adaptive balancing scheme of the workload
among the currently live processors of the BSP machine.
Our strategy is efficient in the sense that, compared with an optimal off-line
adversarial computation under the same sequence of fault occurrences, it achieves an O
¡
.log n ¢ log log n/2¢
multiplicative factor times the optimal work (namely, this
measure is in the sense of the “competitive ratio” of on-line analysis). In addition,
our scheme is modular, integrated, and considers many implementation points.
We comment that, to our knowledge, no previous work on robust parallelcomputations
has considered fully dynamic faults in the BSP model, or in general distributed
memory systems. Furthermore, this is the first time an efficient Las Vegas
simulation in this area is achieved.

Abstract: We consider the problem of preprocessing an n-vertex digraph with real edge weights so that subsequent queries for the shortest path or distance between any two vertices can be efficiently answered. We give parallel algorithms for the EREW PRAM model of computation that depend on the treewidth of the input graph. When the treewidth is a constant, our algorithms can answer distance queries in O({\'a}(n)) time using a single processor, after a preprocessing of O(log2n) time and O(n) work, where {\'a}(n) is the inverse of Ackermann's function. The class of constant treewidth graphs contains outerplanar graphs and series-parallel graphs, among others. To the best of our knowledge, these are the first parallel algorithms which achieve these bounds for any class of graphs except trees. We also give a dynamic algorithm which, after a change in an edge weight, updates our data structures in O(log n) time using O(n{\^a}) work, for any constant 0 < {\^a} < 1. Moreover, we give an algorithm of independent interest: computing a shortest path tree, or finding a negative cycle in O(log2n) time using O(n) work.

Abstract: We study computational and coordination efficiency issues of
Nash equilibria in symmetric network congestion games. We first propose
a simple and natural greedy method that computes a pure Nash equilibrium
with respect to traffic congestion in a network. In this algorithm
each user plays only once and allocates her traffic to a path selected via
a shortest path computation. We then show that this algorithm works
for series-parallel networks when users are identical or when users are of
varying demands but have the same best response strategy for any initial
network traffic. We also give constructions where the algorithm fails if
either the above condition is violated (even for series-parallel networks)
or the network is not series-parallel (even for identical users). Thus, we
essentially indicate the limits of the applicability of this greedy approach.
We also study the price of anarchy for the objective of maximum
latency. We prove that for any network of m uniformly related links and
for identical users, the price of anarchy is {\`E}( logm
log logm).

Abstract: Let M be a single s-t network of parallel links with load dependent latency functions shared by an infinite number of selfish users. This may yield a Nash equilibrium with unbounded Coordination Ratio [E. Koutsoupias, C. Papadimitriou, Worst-case equilibria, in: 16th Annual Symposium on Theoretical Aspects of Computer Science, STACS, vol. 1563, 1999, pp. 404-413; T. Roughgarden, E. Tardos, How bad is selfish routing? in: 41st IEEE Annual Symposium of Foundations of Computer Science, FOCS, 2000, pp. 93-102]. A Leader can decrease the coordination ratio by assigning flow {\'a}r on M, and then all Followers assign selfishly the (1-{\'a})r remaining flow. This is a Stackelberg Scheduling Instance(M,r,{\'a}),0≤{\'a}≤1. It was shown [T. Roughgarden, Stackelberg scheduling strategies, in: 33rd Annual Symposium on Theory of Computing, STOC, 2001, pp. 104-113] that it is weakly NP-hard to compute the optimal Leader's strategy. For any such network M we efficiently compute the minimum portion @b"M of flow r>0 needed by a Leader to induce M's optimum cost, as well as her optimal strategy. This shows that the optimal Leader's strategy on instances (M,r,@a>=@b"M) is in P. Unfortunately, Stackelberg routing in more general nets can be arbitrarily hard. Roughgarden presented a modification of Braess's Paradox graph, such that no strategy controlling {\'a}r flow can induce ≤1/{\'a} times the optimum cost. However, we show that our main result also applies to any s-t net G. We take care of the Braess's graph explicitly, as a convincing example. Finally, we extend this result to k commodities. A conference version of this paper has appeared in [A. Kaporis, P. Spirakis, The price of optimum in stackelberg games on arbitrary single commodity networks and latency functions, in: 18th annual ACM symposium on Parallelism in Algorithms and Architectures, SPAA, 2006, pp. 19-28]. Some preliminary results have also appeared as technical report in [A.C. Kaporis, E. Politopoulou, P.G. Spirakis, The price of optimum in stackelberg games, in: Electronic Colloquium on Computational Complexity, ECCC, (056), 2005].

Abstract: In this work, we study the combinatorial structure and the
computational complexity of Nash equilibria for a certain game that
models selfish routing over a network consisting of m parallel links. We
assume a collection of n users, each employing a mixed strategy, which
is a probability distribution over links, to control the routing of its own
assigned traffic. In a Nash equilibrium, each user selfishly routes its traffic
on those links that minimize its expected latency cost, given the network
congestion caused by the other users. The social cost of a Nash equilibrium
is the expectation, over all random choices of the users, of the
maximum, over all links, latency through a link.
We embark on a systematic study of several algorithmic problems related
to the computation of Nash equilibria for the selfish routing game we consider.
In a nutshell, these problems relate to deciding the existence of a
Nash equilibrium, constructing a Nash equilibrium with given support
characteristics, constructing the worst Nash equilibrium (the one with
maximum social cost), constructing the best Nash equilibrium (the one
with minimum social cost), or computing the social cost of a (given) Nash
equilibrium. Our work provides a comprehensive collection of efficient algorithms,
hardness results (both as NP-hardness and #P-completeness
results), and structural results for these algorithmic problems. Our results
span and contrast a wide range of assumptions on the syntax of the
Nash equilibria and on the parameters of the system.

Abstract: We study computationally hard combinatorial problems arising from the important engineering question of how to maximize the number of connections that can be simultaneously served in a WDM optical network. In such networks, WDM technology can satisfy a set of connections by computing a route and assigning a wavelength to each connection so that no two connections routed through the same fiber are assigned the same wavelength. Each fiber supports a limited number of w wavelengths and in order to fully exploit the parallelism provided by the technology, one should select a set connections of maximum cardinality which can be satisfied using the available wavelengths. This is known as the maximum routing and path coloring problem (maxRPC).
Our main contribution is a general analysis method for a class of iterative algorithms for a more general coloring problem. A lower bound on the benefit of such an algorithm in terms of the optimal benefit and the number of available wavelengths is given by a benefit-revealing linear program. We apply this method to maxRPC in both undirected and bidirected rings to obtain bounds on the approximability of several algorithms. Our results also apply to the problem maxPC where paths instead of connections are given as part of the input. We also study the profit version of maxPC in rings where each path has a profit and the objective is to satisfy a set of paths of maximum total profit.