Commit Graph

  • bc06220aeb Remove AVX512 reciprocal usage in AVX2 file Rafael Ravedutti 2022-11-15 01:40:37 +0100
  • efa462d0af Add AVX_FMA ISA Rafael Ravedutti 2022-11-15 01:24:30 +0100
  • cd1fbfb3c8 Reorganize SIMD files and split AVX and AVX2 Rafael Ravedutti 2022-11-15 00:55:46 +0100
  • f293cec960 Call CPU version of updatePbc within setupPbc Rafael Ravedutti 2022-11-14 19:19:57 +0100
  • 6eedf1776e Small fixes into GROMACS GPU code Rafael Ravedutti 2022-11-14 18:21:14 +0100
  • 93188d1383 Adjust NVCC flags to avoid issues with atomicAdd with doubles Rafael Ravedutti 2022-11-14 18:01:46 +0100
  • c70ebce4c1 Integrate GROMACS GPU implementation into master branch Rafael Ravedutti 2022-11-08 18:33:23 +0100
  • 493915fe95 Fix code for AVX and remove warnings Rafael Ravedutti 2022-11-08 15:30:37 +0100
  • 437b380229 Adjust NVCC flags Rafael Ravedutti 2022-11-07 20:37:01 +0100
  • c4304e3619 Update figure widths again Rafael Ravedutti 2022-09-29 18:41:40 +0200
  • b774e771ba Update width of figures in table Rafael Ravedutti 2022-09-29 18:38:32 +0200
  • e86caa92b1 Fix Verlet Lists figure href Rafael Ravedutti 2022-09-29 18:24:16 +0200
  • b201055658 Update table with HTML Rafael Ravedutti 2022-09-29 18:23:29 +0200
  • 8fce79dda6 Update README.md Rafael Ravedutti 2022-09-29 17:56:37 +0200
  • 1421a023a9 Remove gather-bench image Rafael Ravedutti 2022-09-29 17:50:24 +0200
  • d3811c35c6 Update table with figures Rafael Ravedutti 2022-09-29 17:48:39 +0200
  • 239eea86b4 Update gather_bench figure Rafael Ravedutti 2022-09-29 17:44:35 +0200
  • 2ddb8a2934 Add gather-bench as submodule Rafael Ravedutti 2022-09-29 14:55:11 +0200
  • 1af19ad586 Update gather_bench image to PNG Rafael Ravedutti 2022-09-29 14:47:45 +0200
  • b9fadd7fbf Update introduction text and add gather bench figure Rafael Ravedutti 2022-09-29 14:46:24 +0200
  • 12e7718a5f Insert stubbed case into table as well Rafael Ravedutti 2022-09-29 14:38:08 +0200
  • 4ddd84ef9d Insert table with figures Rafael Ravedutti 2022-09-29 14:37:11 +0200
  • c0a54190d8 Update figures again Rafael Ravedutti 2022-09-29 14:35:30 +0200
  • bc8f0e7c35 Update figs Rafael Ravedutti 2022-09-29 14:28:18 +0200
  • 9301610f7c Add more figures Rafael Ravedutti 2022-09-29 14:23:34 +0200
  • 70a2f48d64 Add links and figures to README Rafael Ravedutti 2022-09-29 14:08:10 +0200
  • 94abf8b362 Add new sections Rafael Ravedutti 2022-09-29 12:52:54 +0200
  • da75f2cc36 Add usage section Rafael Ravedutti 2022-09-29 12:39:09 +0200
  • 880b82a86d Update README.md with config.mk options Rafael Ravedutti 2022-09-29 12:21:29 +0200
  • 35a8e3eeb7 Fix header of likwid-marker.h Rafael Ravedutti 2022-09-29 11:48:05 +0200
  • 3c02a3fb7a Update README.md Jan Eitzinger 2022-09-14 11:05:57 +0200
  • 3d0f4b97ee Switch copyright header in source files. Jan Eitzinger 2022-09-05 10:39:42 +0200
  • 28d3946072 Move common modules to common directory Rafael Ravedutti 2022-08-17 17:56:31 +0200
  • 47db9e86b0 Introduce common directory Rafael Ravedutti 2022-08-17 17:20:57 +0200
  • 418f392a11 Update .gitignore Rafael Ravedutti 2022-08-16 19:33:38 +0200
  • 29fa08fa7f Enhance output for gromacs variant Rafael Ravedutti 2022-08-16 19:32:49 +0200
  • 911ba63336 Adjust ISA options and improve output Rafael Ravedutti 2022-08-16 18:36:47 +0200
  • 0caeea0494 Rename cuda.c to device.c Rafael Ravedutti 2022-08-12 18:17:07 +0200
  • 90609a2b5f Adjust file structure for CUDA Rafael Ravedutti 2022-08-12 18:12:29 +0200
  • 939197a785 Create separate structs DeviceAtom and DeviceNeighbor with device pointers Rafael Ravedutti 2022-08-12 17:28:06 +0200
  • 065b596074 Initial refactoring of CUDA code Rafael Ravedutti 2022-08-12 04:19:38 +0200
  • 959ff65126 Fix macro condition Rafael Ravedutti 2022-08-12 01:29:40 +0200
  • 87d006d418 Fix GPU version Rafael Ravedutti 2022-08-11 16:42:41 +0200
  • 3d95ec4b0a Small fixes Rafael Ravedutti 2022-08-09 19:19:48 +0200
  • c18124b066 Integrate LAMMPS CUDA versions into master branch Rafael Ravedutti 2022-08-09 18:53:53 +0200
  • bc7b523979 Move src directory to lammps cuda_port Rafael Ravedutti 2022-08-04 17:25:31 +0200
  • eeba125a52 Remove likwid and architecture-specific compilation flags Rafael Ravedutti 2022-08-04 17:09:17 +0200
  • b32254b03f Changed data types in currently unused sort method to also work with single precision floating numbers Martin Bauernfeind 2022-07-22 13:55:27 +0200
  • 4dac820784 Added newline in output to improve formatting Martin Bauernfeind 2022-07-20 23:04:22 +0200
  • fe56c50efd Added one more output line to output the force kernel throughput Martin Bauernfeind 2022-07-20 22:43:57 +0200
  • 7a61cbbabf Instrumented the reneighbor function in order to obtain runtimes of its components Martin Bauernfeind 2022-07-19 20:38:11 +0200
  • eb77e1a3bd Fix DEM setup Rafael Ravedutti 2022-07-19 04:13:06 +0200
  • 2e77f6207b Avoid errors when compiling for AVX2 due to SIMD LJ implementation Rafael Ravedutti 2022-07-19 02:30:26 +0200
  • 176de0525b Instrumented the reneighbor function with timers (via getTimestamp()) to measure the runtime of its different components/methods Martin Bauernfeind 2022-07-17 18:34:17 +0200
  • 7bad7e84b6 Fixed compiler errors Martin Bauernfeind 2022-07-13 14:52:37 +0200
  • fb304f240b Small changes in buildNeighbor to initialize the bincount list and other arrays only once Martin Bauernfeind 2022-07-13 14:42:34 +0200
  • 5a6d1851ed Ported updateAtomsPbc to cuda and changed the code to use the cuda version from now on Martin Bauernfeind 2022-07-13 14:07:19 +0200
  • 577955dfb7 Apply first changes to DEM kernel Rafael Ravedutti 2022-07-13 02:34:33 +0200
  • f61f59ba3f Fixed a compiler error and removed an unnecessary memcpy (from device to host) - performance seems to have crossed the 300M updates/second mark for the A100 Martin Bauernfeind 2022-07-11 00:55:42 +0200
  • d1c2249b55 Added code to sort the contents of all bins to make it comparable to the CPU version Martin Bauernfeind 2022-07-11 00:24:48 +0200
  • c9db6e45fa Fixed compiler errors Martin Bauernfeind 2022-07-10 21:13:37 +0200
  • 0967e8f671 The program now does the binning on the GPU via the binatoms_cuda method Martin Bauernfeind 2022-07-10 18:05:06 +0200
  • 99237241fb Include domain box on DEM input file Rafael Ravedutti 2022-07-08 23:15:30 +0200
  • fa409c016c Added a struct to contain binning information such as the pointer to bincount and bins - not used yet Martin Bauernfeind 2022-07-08 13:52:45 +0200
  • 3b85da83a7 Update timestep size for dem Rafael Ravedutti 2022-07-08 02:56:56 +0200
  • 814f561993 Allow PBC in just some directions Rafael Ravedutti 2022-07-08 02:30:03 +0200
  • 32836eebcb Setup first DEM example with input file from lecture Rafael Ravedutti 2022-07-07 02:11:50 +0200
  • 9ffc09f497 Add DEM kernel to parameter options Rafael Ravedutti 2022-07-07 00:47:38 +0200
  • b65199308d Ported the binatoms method to cuda - not used in the program yet Martin Bauernfeind 2022-07-06 01:09:11 +0200
  • 79483a446e Adjust code with DEM to be compilable Rafael Ravedutti 2022-07-06 01:07:39 +0200
  • bb599c9ea8 Add first version of DEM kernel Rafael Ravedutti 2022-07-05 15:33:31 +0200
  • 71798f5ec5 🐛 Fixed aforementioned correctness issue by deleting a superfluous cudaMemcpy in computeForce() that was overwriting correct data with incorrect data Martin Bauernfeind 2022-07-05 00:54:11 +0200
  • 4f0403d3ea Fixed a correctness issue by conservatively copying over data from and to the GPU Martin Bauernfeind 2022-07-05 00:33:12 +0200
  • fa86e44f90 Fixed wrong number of thread blocks being launched Martin Bauernfeind 2022-07-04 19:36:09 +0200
  • 7e8fd96fa4 Fixed some compiler errors - the simulation seems to be off regarding how many ghost atoms are used -> some bugfixing might be needed Martin Bauernfeind 2022-07-03 21:14:33 +0200
  • 463de5b1ed Ported the updatePbc method to cuda Martin Bauernfeind 2022-07-03 19:53:33 +0200
  • 4a32a62a98 🐛 Fixed some bugs - neighborhood computation now seems to be quite fast Martin Bauernfeind 2022-06-26 20:19:59 +0200
  • 16e8b76012 Added debug output to find memory leak Martin Bauernfeind 2022-06-26 19:43:10 +0200
  • 60ed524dd8 Fixed various compiler errors - now there's probably a memory leak remaining Martin Bauernfeind 2022-06-26 18:37:09 +0200
  • 45f83c7607 Fixed some struct declaration mistakes Martin Bauernfeind 2022-06-26 17:52:09 +0200
  • c49278cb21 First crude attempt at parallelizing neighborhood computation (only the part after binning the atoms is parallelized with cuda) Martin Bauernfeind 2022-06-26 16:25:59 +0200
  • 757d4329f3 Added a rough sketch for the next steps of porting neighborhood computation to cuda Martin Bauernfeind 2022-06-23 23:58:15 +0200
  • 67f9c769ef Fixing errors - hopefully it works this time Martin Bauernfeind 2022-06-23 22:25:55 +0200
  • b5b4d23c0c 🐛 further refactoring fixing Martin Bauernfeind 2022-06-23 19:46:29 +0200
  • fea1e41daa 🐛 further refactoring step fixing Martin Bauernfeind 2022-06-23 19:43:36 +0200
  • f1998b7acc 🐛 further refactor step fixing Martin Bauernfeind 2022-06-23 19:39:36 +0200
  • 2fe3cd80a0 🐛 further refactor step fixing Martin Bauernfeind 2022-06-23 19:36:59 +0200
  • f4313f64e5 ♻️ further refactoring step fixing Martin Bauernfeind 2022-06-23 19:34:16 +0200
  • 7f068a6959 ♻️ Fixing refactoring step Martin Bauernfeind 2022-06-23 19:32:09 +0200
  • 62cfc22856 ♻️ Refactoring: pulled definition of the GPU atom and neighbor representation from force.cu and put it into main Martin Bauernfeind 2022-06-23 18:54:56 +0200
  • e4d7faf91b Adjust cutforce and atom positions in stubbed version Rafael Ravedutti 2022-05-14 01:02:08 +0200
  • bbdcaf2983 New stubbed version Rafael Ravedutti 2022-05-14 00:55:33 +0200
  • 14838389ff Fix stubbed variant for LAMMPS algorithm Rafael Ravedutti 2022-04-30 04:08:18 +0200
  • ab2eb1ff50 Write LAMMPS kernel with SIMD intrinsics and implement AVX512 with double-precision functions Rafael Ravedutti 2022-04-05 02:57:23 +0200
  • af1756bfe4 Fix skin for Argon simulation Rafael Ravedutti 2022-04-04 22:22:35 +0200
  • 4d11c5a3c2 Merge branch 'master' of github.com:RRZE-HPC/MD-Bench Rafael Ravedutti 2022-04-04 21:52:47 +0200
  • e48b3fb653 Add option to check if cj is local before applying reaction force Rafael Ravedutti 2022-04-04 21:52:40 +0200
  • 7a0d6479a1 Merge branch 'master' of https://github.com/RRZE-HPC/MD-Bench Jan Eitzinger 2022-04-01 15:58:05 +0200
  • 5585ebcf42 Add ONEAPI config. Remove omp simd for full neigh. Jan Eitzinger 2022-04-01 15:57:54 +0200
  • fdbeed4368 Fix AVX2 versions with half neighbor lists Rafael Ravedutti 2022-03-27 16:39:39 +0200