Publication: A Co-allocation Strategy for Jobs in Multiple HPC Clusters


Title A Co-allocation Strategy for Jobs in Multiple HPC Clusters
Authors/Editors J. Qin, M. Bauer
Where published The 19th International Conference on Parallel and Distributed Computing Systems (PDCS-2006)
How published Proceedings
Year 2006
Pages 114-121
Publisher ISCA
Keywords allocation, multiple HPC clusters, HPC grid
To use HPC clusters more effectively for even larger computations, reduce turn-around times, and better utilize compute resources, users are looking to interconnect multiple HPC clusters, creating a grid. To use such grids effectively, it may be desirable to split jobs requiring many processors and co-allocate them across multiple clusters. While splitting a very large job across multiple clusters is an attractive possibility, the benefit, in terms of reduced turn-around time, ultimately depends on the communication patterns between processes, the workload on the communication links, and the maximum bandwidth of the links. In this research, a resource management system model for a multi-cluster grid is presented, and a scalable job scheduling and allocation strategy for job co-allocation is proposed. A simulator developed in our previous work, which simulates the dynamic behavior of jobs across multiple clusters, is used to evaluate the proposed co-allocation strategy. The results show that, with properly selected threshold values for link saturation level control and chunk size control in splitting jobs, the proposed co-allocation strategy can significantly improve both user satisfaction and system resource utilization, even for jobs with large communication requirements.