MD-Bench

MD-Bench is a toolbox for the performance engineering of short-range force calculation kernels in molecular dynamics applications. It aims to cover all available state-of-the-art algorithms from different community codes such as LAMMPS and GROMACS.

In addition, it provides tools to study and evaluate the in-depth performance of these kernels on different hardware. These include gather-bench, a benchmark that mimics the data movement of such kernels, and stubbed force calculation cases that isolate the impact of memory latencies and control flow divergence.

Build instructions

Configure the build by editing the config.mk file. The following options are available:

TAG: Compiler tag (available options: GCC, CLANG, ICC, ONEAPI, NVCC).
ISA: Instruction set (available options: SSE, AVX, AVX2, AVX512).
MASK_REGISTERS: Use AVX512 mask registers (always true when ISA is set to AVX512).
OPT_SCHEME: Optimization algorithm (available options: lammps, gromacs).
ENABLE_LIKWID: Enable likwid to make use of HPM counters.
DATA_TYPE: Floating-point precision (available options: SP, DP).
DATA_LAYOUT: Data layout for atom vector properties (available options: AOS, SOA).
ASM_SYNTAX: Assembly syntax to use when generating assembly files (available options: ATT, INTEL).
DEBUG: Toggle debug mode.
EXPLICIT_TYPES: Explicitly store and load atom types.
MEM_TRACER: Trace memory addresses for cache simulator.
INDEX_TRACER: Trace indexes and distances for gather-md.
COMPUTE_STATS: Compute statistics.

Configurations for LAMMPS Verlet Lists optimization scheme:

ENABLE_OMP_SIMD: Use omp simd pragma on half neighbor-lists kernels.
USE_SIMD_KERNEL: Compile kernel with explicit SIMD intrinsics.

Configurations for GROMACS MxN optimization scheme:

USE_REFERENCE_VERSION: Use reference version (only for correctness checking).
XTC_OUTPUT: Enable XTC output.
HALF_NEIGHBOR_LISTS_CHECK_CJ: Check if j-clusters are local when decreasing the reaction force.

Configurations for CUDA:

USE_CUDA_HOST_MEMORY: Use CUDA host memory to optimize host-device transfers.

Once configured, run make to compile the code. You can clean intermediate build results with make clean, and all build results with make distclean. Run make clean before make whenever you change the build settings.
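
The rebuild step can be wrapped in a small shell helper. This is only a sketch: the function name `rebuild` is hypothetical and not part of MD-Bench, and it assumes you run it from the repository root, where config.mk and the Makefile live.

```shell
# Hypothetical helper (not shipped with MD-Bench): rebuild after editing config.mk.
rebuild() {
    # Refuse to run outside the repository root, where config.mk is expected.
    if [ ! -f config.mk ]; then
        echo "config.mk not found; run this from the repository root" >&2
        return 1
    fi
    # Clean first so that changed build settings take effect, then compile.
    make clean && make
}
```

Calling `rebuild` after every change to config.mk avoids mixing object files that were built with different settings.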

Usage

Use the following command to run a simulation:

./MD-Bench-<TAG>-<OPT_SCHEME> [OPTION]...

Here, TAG and OPT_SCHEME correspond to the build options of the same name. Without any options, the program simulates a copper FCC lattice system of size 32x32x32 (131072 atoms) over 200 time steps using the Lennard-Jones potential (sigma=1.0, epsilon=1.0).
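
The default atom count matches a face-centered cubic lattice with four atoms per conventional unit cell. The short calculation below reproduces it. The variable names are illustrative, and the assumption that each of the 32x32x32 cells holds exactly four atoms is inferred from the stated total rather than taken from the MD-Bench source.

```shell
# Atoms in an FCC lattice: 4 atoms per conventional unit cell (assumed here).
nx=32; ny=32; nz=32
atoms_per_cell=4
natoms=$((atoms_per_cell * nx * ny * nz))
echo "$natoms"   # prints 131072
```

The same formula gives a quick estimate of the system size when the -nx/-ny/-nz options are changed, under the same assumption.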

The default behavior and other options can be changed using the following parameters:

-p <string>:          file to read parameters from (can be specified more than once)
-f <string>:          force field (lj or eam), default lj
-i <string>:          input file with atom positions (dump)
-e <string>:          input file for EAM
-n / --nsteps <int>:  set number of timesteps for simulation
-nx/-ny/-nz <int>:    set linear dimension of systembox in x/y/z direction
-r / --radius <real>: set cutoff radius
-s / --skin <real>:   set skin (verlet buffer)
--freq <real>:        processor frequency (GHz)
--vtk <string>:       VTK file for visualization
--xtc <string>:       XTC file for visualization

Examples
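
As a minimal sketch (not an example from the MD-Bench authors), the script below assembles a command line from the options listed above. It assumes a build with TAG=GCC and OPT_SCHEME=lammps. The system size and step count are arbitrary illustrative values. If the binary is not present, the script only prints the command it would run.

```shell
# Assumed build settings; adjust to match your config.mk.
TAG=GCC
OPT_SCHEME=lammps
BIN="./MD-Bench-${TAG}-${OPT_SCHEME}"

# Arbitrary illustrative values: a 16x16x16 system for 100 time steps.
N=16
STEPS=100
CMD="$BIN -f lj -nx $N -ny $N -nz $N --nsteps $STEPS"

if [ -x "$BIN" ]; then
    $CMD    # run the simulation with the Lennard-Jones force field
else
    echo "binary not found, would run: $CMD"
fi
```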

Citations

R. Ravedutti Lucio Machado, J. Eitzinger, H. Köstler, and G. Wellein: MD-Bench: A generic proxy-app toolbox for state-of-the-art molecular dynamics algorithms. Accepted for PPAM 2022, the 14th International Conference on Parallel Processing and Applied Mathematics, Gdansk, Poland, September 11-14, 2022. PPAM 2022 Best Paper Award. Preprint: arXiv:2207.13094

Credits

MD-Bench is developed by the Erlangen National High Performance Computing Center (NHR@FAU) at the University of Erlangen-Nürnberg.

License

LGPL-3.0