Rafael Ravedutti 2ddb8a2934 Add gather-bench as submodule
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
2022-09-29 14:55:11 +02:00
asm Add first version with more than one optimization scheme 2022-01-17 14:15:02 +01:00
common Fix header of likwid-marker.h 2022-09-29 11:48:05 +02:00
data Fix DEM setup 2022-07-19 04:13:06 +02:00
figures Update gather_bench image to PNG 2022-09-29 14:47:45 +02:00
gather-bench@2f654cb043 Add gather-bench as submodule 2022-09-29 14:55:11 +02:00
gromacs Switch copyright header in source files. 2022-09-05 10:39:42 +02:00
lammps Switch copyright header in source files. 2022-09-05 10:39:42 +02:00
util Switch copyright header in source files. 2022-09-05 10:39:42 +02:00
.gitignore Update .gitignore 2022-08-16 19:33:38 +02:00
.gitmodules Add gather-bench as submodule 2022-09-29 14:55:11 +02:00
config.mk Adjust ISA options and improve output 2022-08-16 18:36:47 +02:00
include_CLANG.mk Update compilation flags for all available compilers 2022-01-17 11:40:44 +01:00
include_GCC.mk Avoid errors when compiling for AVX2 due to SIMD LJ implementation 2022-07-19 02:30:26 +02:00
include_GROMACS.mk Add SIMD version with AVX (no AVX2) and XTC output 2022-03-02 23:12:04 +01:00
include_ICC.mk Add ONEAPI config. Remove omp simd for full neigh. 2022-04-01 15:57:54 +02:00
include_ISA.mk Adjust ISA options and improve output 2022-08-16 18:36:47 +02:00
include_LIKWID.mk Add LIKWID Option. Allow to overwrite with asm variant. 2021-06-11 09:38:34 +02:00
include_NVCC.mk Integrate LAMMPS CUDA versions into master branch 2022-08-09 18:53:53 +02:00
include_ONEAPI.mk Add ONEAPI config. Remove omp simd for full neigh. 2022-04-01 15:57:54 +02:00
LICENSE Switch License to LGPL3 2020-08-19 10:47:40 +02:00
Makefile Introduce common directory 2022-08-17 17:20:57 +02:00
README.md Update gather_bench image to PNG 2022-09-29 14:47:45 +02:00

MD-Bench


MD-Bench is a toolbox for the performance engineering of short-range force calculation kernels in molecular dynamics applications. It aims to cover the state-of-the-art algorithms used in community codes such as LAMMPS and GROMACS.

It also provides tools to study the performance of these kernels in depth on different hardware. Examples are gather-bench, a benchmark that mimics the data movement of such kernels, and the stubbed force calculation cases, which isolate the impact of memory latencies and control flow divergence.

Figures: Verlet Lists, GROMACS MxN, stubbed cases.


Build instructions

Configure the build by editing the config.mk file. The following options are available:

  • TAG: Compiler tag (available options: GCC, CLANG, ICC, ONEAPI, NVCC).
  • ISA: Instruction set (available options: SSE, AVX, AVX2, AVX512).
  • MASK_REGISTERS: Use AVX512 mask registers (always true when ISA is set to AVX512).
  • OPT_SCHEME: Optimization algorithm (available options: lammps, gromacs).
  • ENABLE_LIKWID: Enable LIKWID to make use of hardware performance monitoring (HPM) counters.
  • DATA_TYPE: Floating-point precision (available options: SP, DP).
  • DATA_LAYOUT: Data layout for atom vector properties (available options: AOS, SOA).
  • ASM_SYNTAX: Assembly syntax to use when generating assembly files (available options: ATT, INTEL).
  • DEBUG: Toggle debug mode.
  • EXPLICIT_TYPES: Explicitly store and load atom types.
  • MEM_TRACER: Trace memory addresses for cache simulator.
  • INDEX_TRACER: Trace indexes and distances for gather-md.
  • COMPUTE_STATS: Compute statistics.

Configurations for LAMMPS Verlet Lists optimization scheme:

  • ENABLE_OMP_SIMD: Use omp simd pragma on half neighbor-lists kernels.
  • USE_SIMD_KERNEL: Compile kernel with explicit SIMD intrinsics.

Configurations for GROMACS MxN optimization scheme:

  • USE_REFERENCE_VERSION: Use reference version (only for correctness checks).
  • XTC_OUTPUT: Enable XTC output.
  • HALF_NEIGHBOR_LISTS_CHECK_CJ: Check if j-clusters are local when decreasing the reaction force.

Configurations for CUDA:

  • USE_CUDA_HOST_MEMORY: Use CUDA host memory to optimize host-device transfers.
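
As an illustration of the options above, a config.mk for a double-precision LAMMPS-style build with GCC and AVX2 might look like the following fragment. This is a sketch, not the file shipped with the repository: the `?=` assignment style and the `true`/`false` spelling of boolean values are assumptions, so check them against the config.mk in your checkout.

```make
# Illustrative config.mk fragment (assignment style and boolean spelling are assumptions)
TAG ?= GCC                # Compiler tag: GCC, CLANG, ICC, ONEAPI, NVCC
ISA ?= AVX2               # Instruction set: SSE, AVX, AVX2, AVX512
MASK_REGISTERS ?= false   # Always true when ISA is AVX512
OPT_SCHEME ?= lammps      # Optimization scheme: lammps, gromacs
ENABLE_LIKWID ?= false    # HPM counters via LIKWID
DATA_TYPE ?= DP           # Floating-point precision: SP, DP
DATA_LAYOUT ?= AOS        # Atom vector layout: AOS, SOA
ASM_SYNTAX ?= ATT         # Assembly syntax: ATT, INTEL
DEBUG ?= false            # Debug mode
```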

When done, run make to compile the code. You can remove intermediate build results with make clean, and all build results with make distclean. You must run make clean before make whenever you change the build settings.

Usage

Use the following command to run a simulation:

./MD-Bench-<TAG>-<OPT_SCHEME> [OPTION]...

Here, TAG and OPT_SCHEME correspond to the build options of the same name. Without any options, MD-Bench simulates a copper FCC lattice of size 32x32x32 (131072 atoms) for 200 time steps using the Lennard-Jones potential (sigma=1.0, epsilon=1.0).
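
The default atom count can be checked with a short calculation. The sketch below assumes that the 32x32x32 size counts FCC unit cells and uses the fact that an FCC unit cell contains 4 atoms; the variable names are illustrative only.

```shell
# FCC lattice: 4 atoms per unit cell
ATOMS_PER_CELL=4
# Default system size (assumed to be given in unit cells)
NX=32
NY=32
NZ=32
NATOMS=$((ATOMS_PER_CELL * NX * NY * NZ))
echo "${NATOMS} atoms"   # prints "131072 atoms"
```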

The default behavior and other options can be changed using the following parameters:

-p <string>:          file to read parameters from (can be specified more than once)
-f <string>:          force field (lj or eam), default lj
-i <string>:          input file with atom positions (dump)
-e <string>:          input file for EAM
-n / --nsteps <int>:  set number of timesteps for simulation
-nx/-ny/-nz <int>:    set linear dimension of system box in x/y/z direction
-r / --radius <real>: set cutoff radius
-s / --skin <real>:   set skin (verlet buffer)
--freq <real>:        processor frequency (GHz)
--vtk <string>:       VTK file for visualization
--xtc <string>:       XTC file for visualization

Examples
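
A possible invocation that combines several of the options listed under Usage is sketched below. The TAG and OPT_SCHEME values, the box size, the step count, and the cutoff and skin values are illustrative assumptions, not recommended settings. The guard only prints the command when the binary has not been built yet.

```shell
TAG=GCC             # must match the TAG used at build time (assumed value)
OPT_SCHEME=lammps   # must match the OPT_SCHEME used at build time (assumed value)
BIN="./MD-Bench-${TAG}-${OPT_SCHEME}"

# 500 time steps on a 16x16x16 box with cutoff 2.5 and skin 0.3 (illustrative values)
ARGS="-n 500 -nx 16 -ny 16 -nz 16 -r 2.5 -s 0.3"

if [ -x "$BIN" ]; then
    # ARGS is left unquoted on purpose so that it splits into separate arguments
    "$BIN" $ARGS
else
    echo "binary not found, would run: $BIN $ARGS"
fi
```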

Citations

R. Ravedutti Lucio Machado, J. Eitzinger, H. Köstler, and G. Wellein: MD-Bench: A generic proxy-app toolbox for state-of-the-art molecular dynamics algorithms. Accepted for PPAM 2022, the 14th International Conference on Parallel Processing and Applied Mathematics, Gdansk, Poland, September 11-14, 2022. PPAM 2022 Best Paper Award. Preprint: arXiv:2207.13094

Credits

MD-Bench is developed by the Erlangen National High Performance Computing Center (NHR@FAU) at the University of Erlangen-Nürnberg.

License

LGPL-3.0