Abstract | The existence of good probabilistic models
for the job arrival process and the delay components
introduced at different stages of job processing in a
Grid environment is important for the improved
understanding of the Grid computing concept. In this
study, we present a thorough analysis of the job
arrival process in the EGEE infrastructure and of the
time durations a job spends in different states of the
EGEE environment. We define four delay components
of the total job delay and model each component
separately. We observe that the job inter-arrival
times at the Grid level can be adequately modelled by
a rounded exponential distribution, while the total job
delay (from the time it is generated until the time it
completes execution) is dominated by the computing
element’s (CE) register and queuing times and the worker
node’s (WN) execution times. Further, we evaluate the
efficiency of the EGEE environment by comparing
the total job delay performance with that of a
hypothetical ideal super-cluster and conclude that we
would obtain similar performance if we submitted the
same workload to a super-cluster of size equal to 34%
of the total average number of CPUs participating in
the EGEE infrastructure. We also analyze the job
inter-arrival times, the CE’s queuing times, the WN’s
execution times, and the data sizes exchanged at the
kallisto.hellasgrid.gr cluster, which is a node in the
EGEE infrastructure. In contrast to the Grid level, we
find that at the cluster level the job arrival process
exhibits self-similarity/long-range dependence. Finally,
we propose simple and intuitive models for the job
arrival process and the execution times at the cluster
level. |