The impact of the number of storage targets on performance when accessing parallel file systems

Where? Centre Inria de l’Université de Bordeaux, TADaaM team [1].

Advised by: Francieli Boito (francieli.zanon-boito@u-bordeaux.fr) and Luan Teylo (luan.gouveia-lima@inria.fr).

Keywords: high-performance computing, parallel I/O, parallel file systems, performance evaluation.

Context

In high-performance computing (HPC) platforms, i.e. supercomputers, applications running on the compute nodes access persistent data through a remote parallel file system (PFS), which is deployed over a set of dedicated servers. Each PFS server hosts one or more storage targets (OSTs), usually each associated with a different storage device. In these systems, files are broken into fixed-size stripes and distributed across the storage targets, so that different stripes can be accessed in parallel. Access to the PFS is thus called parallel I/O [2].
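As a minimal illustration of striping, the snippet below models the round-robin distribution of stripes over OSTs (a simplified model; the stripe size and OST count are illustrative parameters, not the defaults of any particular file system):

```python
# Minimal model of file striping across storage targets (OSTs).
# stripe_size and ost_count are illustrative parameters, not the
# defaults of any particular file system.

def ost_for_offset(offset: int, stripe_size: int, ost_count: int) -> int:
    """Return the index of the OST holding the byte at `offset`,
    assuming stripes are distributed round-robin."""
    stripe_index = offset // stripe_size
    return stripe_index % ost_count

# With 1 MiB stripes over 4 OSTs, consecutive stripes land on
# different targets and can therefore be accessed in parallel.
stripe = 1 << 20  # 1 MiB
print([ost_for_offset(i * stripe, stripe, 4) for i in range(8)])
# → [0, 1, 2, 3, 0, 1, 2, 3]
```

Because consecutive stripes map to different targets, a large contiguous access is served by several devices at once, which is the source of the parallelism discussed above.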

This research field is important because of the historical gap between processing and I/O speeds in HPC systems. Consequently, many HPC applications, even compute-intensive ones, spend a large share of their execution time on I/O operations, which prevents them from scaling.

The performance applications observe when accessing the I/O infrastructure is heavily impacted by the way this access is done: how many files are used, whether they are shared between processes, how many processes and nodes are involved, how much data is moved, in requests of what size, at what offsets within the files, etc. This set of characteristics is commonly called the application’s access pattern.

It has been observed that the number of storage targets used by an application has a strong impact on its performance [3]. Moreover, this impact is not the same for all applications, but depends on their access patterns [4]. Today, systems are not able to adapt to different application characteristics, and hence apply a default number of OSTs to all of them, chosen to cover a frequent case but suboptimal for many others.

Objectives

This internship has two main objectives: i) to extend the IOPS tool [5]; and ii) to conduct experiments on different platforms, focusing on the impact of the number of storage targets.

Extending IOPS

The I/O Performance Evaluation Suite (IOPS) is a tool being developed in the TADaaM team of Inria Bordeaux to simplify the process of benchmark execution and results analysis in HPC systems. It uses IOR [6] to run experiments with different parameters. The goal of the tool is to automate the performance evaluation process described in our paper [3], where we first explored the number of nodes, the number of processes, and the file size to find a configuration that reaches the system’s peak performance, and then used these parameters to study the impact of the number of OSTs.

The first improvement we want to make to IOPS is to make it smarter. This is important because evaluating all combinations of parameters can take a long time and consume a lot of shared resources. Therefore, instead of testing all possible parameter combinations, IOPS should apply a heuristic to run only the necessary tests. The intern will participate in the work of proposing, developing, and testing this heuristic.
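One possible shape for such a heuristic is a greedy "vary one parameter at a time" sweep, sketched below. This is only an illustration of the idea, not the heuristic to be developed; `run_benchmark` stands in for an IOR execution and is an assumption, not an existing part of IOPS.

```python
# Sketch of a coordinate-search heuristic: sweep one parameter at a
# time instead of running the full Cartesian product of configurations.
# `run_benchmark` is a stand-in for launching IOR and returning the
# measured bandwidth; it is a hypothetical callback, not part of IOPS.

def coordinate_search(grid: dict, run_benchmark) -> dict:
    """Fix every parameter at its first value, then sweep each
    parameter in turn, keeping the best-performing value."""
    config = {name: values[0] for name, values in grid.items()}
    for name, values in grid.items():
        config[name] = max(
            values, key=lambda v: run_benchmark({**config, name: v})
        )
    return config

grid = {
    "nodes": [1, 2, 4, 8],
    "procs_per_node": [1, 8, 16],
    "file_size_gib": [1, 4, 16],
}
# 4 + 3 + 3 = 10 benchmark runs instead of 4 * 3 * 3 = 36 combinations.
```

A greedy sweep like this assumes the parameters are roughly independent; more robust strategies (e.g. stopping a sweep once throughput plateaus) are among the options the intern could explore.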

Another improvement, also part of the internship, is to cover multiple access patterns. So far, the tool only supports sequential writes, but it should allow any other access pattern supported by IOR to be represented.
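Supporting multiple patterns could amount to translating a pattern name into the corresponding IOR flags. The mapping below is illustrative: the flags follow IOR's documentation (`-w` write, `-r` read, `-z` random offsets, `-F` file per process), but should be checked against the installed IOR version, and the pattern names themselves are assumptions.

```python
# Illustrative mapping from named access patterns to IOR command-line
# flags (-w write, -r read, -z random offsets, -F file per process).
# Check flag names against the installed IOR version.

PATTERNS = {
    "sequential_write": ["-w"],
    "sequential_read": ["-r"],
    "random_write": ["-w", "-z"],
    "file_per_process_write": ["-w", "-F"],
}

def ior_args(pattern: str, transfer_size: str = "1m", block_size: str = "16m") -> list:
    """Build an IOR argument list for a named access pattern."""
    if pattern not in PATTERNS:
        raise ValueError(f"unsupported pattern: {pattern}")
    return ["ior", "-t", transfer_size, "-b", block_size] + PATTERNS[pattern]

print(ior_args("random_write"))
# → ['ior', '-t', '1m', '-b', '16m', '-w', '-z']
```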

Performance Study

Once the IOPS tool has been extended, the intern will use it to conduct experiments on different systems (PlaFRIM [7], Grid'5000 [8], production large-scale systems available through GENCI [9], etc.) to measure the impact of the number of storage targets used on performance with different access patterns. Different systems will be used to cover different file systems (at least Lustre [10] and BeeGFS [11]), network topologies and speeds, etc.
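For context, on Lustre the number of OSTs over which new files are striped can be set per directory with `lfs setstripe`. A hypothetical helper for building that command is sketched below; the `-c`/`-S` flags are Lustre's stripe-count and stripe-size options, and actually executing the command requires a Lustre client.

```python
# Hypothetical helper building the Lustre `lfs setstripe` command,
# which controls how many OSTs (-c) and which stripe size (-S) new
# files under a directory will use. Executing it requires a Lustre
# client with `lfs` in the PATH.
import subprocess

def setstripe_cmd(directory: str, ost_count: int, stripe_size: str = "1m") -> list:
    """Return the argument list for striping `directory` over `ost_count` OSTs."""
    return ["lfs", "setstripe", "-c", str(ost_count), "-S", stripe_size, directory]

# Example (not executed here):
# subprocess.run(setstripe_cmd("/mnt/lustre/expdir", 8), check=True)
```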

In addition to IOPS, representative scientific applications of different types — numerical simulations, data analytics, deep learning training and inference, etc. — will be included. Profiling and tracing tools, such as Darshan [12] and EZTrace [13], will be used to collect various metrics during execution. Obtaining these profiles and traces will be an important contribution of this internship, because the longer-term goal is to propose a technique to predict an application’s performance with different numbers of OSTs.

Perspectives

The work done in this internship may result in the student co-authoring scientific papers. Moreover, a Ph.D. position to continue working on this project may be proposed after the internship.

This position is funded by the PEPR NumPEx project and includes collaboration with other French institutional and industry partners.

Requirements

The ideal person for this position:

  • is comfortable using Linux systems, command line, running MPI applications, submitting jobs to clusters, etc;
  • can write scripts to run experiments, parse and analyze results in Bash, Python, or R;
  • can create documents in LaTeX;
  • can communicate in English (even if not perfectly fluent);
  • is curious and has an interest in research.

Please note that these are not hard requirements, but desirable characteristics. The most important one is that the person is reasonably independent and capable of learning whatever is required to conduct the proposed project.

References

[1] https://team.inria.fr/tadaam/
[2] F. Boito, E. Inacio, J. Bez, P. Navaux, M. Dantas, Y. Denneulin. A Checkpoint of Research on Parallel I/O for High Performance Computing. ACM Computing Surveys, 2018. https://hal.univ-grenoble-alpes.fr/hal-01591755
[3] F. Boito, G. Pallez, L. Teylo. The role of storage target allocation in applications’ I/O performance with BeeGFS. CLUSTER 2022. https://inria.hal.science/hal-03753813
[4] F. Chowdhury, Y. Zhu, T. Heer, S. Paredes, A. Moody, R. Goldstone, K. Mohror, W. Yu. I/O Characterization and Performance Evaluation of BeeGFS for Deep Learning. 2019. https://doi.org/10.1145/3337821.3337902
[5] I/O Performance Evaluation Suite (IOPS). https://gitlab.inria.fr/lgouveia/iops
[6] https://ior.readthedocs.io/en/latest/userDoc/tutorial.html
[7] https://www.plafrim.fr/
[8] https://www.grid5000.fr/w/Grid5000:Home
[9] https://www.genci.fr/fr
[10] https://www.lustre.org/
[11] https://www.beegfs.io/c/
[12] https://www.mcs.anl.gov/research/projects/darshan/
[13] https://eztrace.gitlab.io/eztrace/