TejeshPala
|
7ee250161a
|
Use omp_get_max_threads instead of omp_get_num_threads for GCC compatibility
Signed-off-by: TejeshPala <tejesh.pala@fau.de>
|
2024-01-13 15:09:03 +01:00 |
|
TejeshPala
|
4cfa664533
|
schedule options for force kernels and to print in main fn
Signed-off-by: TejeshPala <tejesh.pala@fau.de>
|
2024-01-11 17:09:18 +01:00 |
|
TejeshPala
|
c4e5e87265
|
omp print threads
|
2023-11-21 15:31:27 +01:00 |
|
Rafael Ravedutti
|
151f0c0e6f
|
Add extended param option
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2023-05-29 02:27:32 +02:00 |
|
Rafael Ravedutti
|
c438fc6832
|
Fix GROMACS AVX2 code
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2023-04-07 21:54:07 +02:00 |
|
Rafael Ravedutti
|
039de0be99
|
Fix stubbed versions and debug messages
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2023-03-30 03:49:57 +02:00 |
|
Rafael Ravedutti
|
43259eb3cf
|
Adjust neighbor lists layout to keep neighbor ids contiguous in memory
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2023-03-30 01:57:26 +02:00 |
|
Rafael Ravedutti
|
3eb7170a65
|
Adapt stubbed version for new neighbor lists in GROMACS
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2023-03-29 21:54:33 +02:00 |
|
Rafael Ravedutti
|
59145644e3
|
Last changes to 2xnn kernels
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2023-03-28 23:34:07 +02:00 |
|
Rafael Ravedutti
|
b15aa2f461
|
Optimize 4xn kernels
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2023-03-28 23:00:21 +02:00 |
|
Rafael Ravedutti
|
5c000444a4
|
Pre-compute masks for 4xn kernels
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2023-03-28 22:30:30 +02:00 |
|
Rafael Ravedutti
|
04ade6bcec
|
Pre-compute masks for 2xnn kernel with full neighbor-lists
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2023-03-28 19:33:26 +02:00 |
|
Rafael Ravedutti
|
85f1484449
|
Specialize force kernel when there are no masks to be checked
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2023-03-28 18:04:18 +02:00 |
|
Rafael Ravedutti
|
965fda3879
|
Pre-compute masks in the same way as in the master branch
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2023-03-28 17:32:42 +02:00 |
|
Rafael Ravedutti
|
a86d214c73
|
Add working version with old masking
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2023-03-28 02:19:46 +02:00 |
|
Rafael Ravedutti
|
d138f975f6
|
Add diagonal checks
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2023-03-23 02:17:27 +01:00 |
|
Rafael Ravedutti
|
296a4c4e01
|
Set interaction masks as gromacs does
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2023-03-23 00:58:25 +01:00 |
|
Yannick Paschke
|
c61cf9a0ac
|
Move likwid marker calls into OpenMP parallel region
|
2023-01-22 15:33:05 +01:00 |
|
Rafael Ravedutti
|
07f2f74561
|
Adjust force_iters stats for 4xN kernel
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2023-01-02 23:57:51 +01:00 |
|
Rafael Ravedutti
|
fe86c948a8
|
Adjust time and likwid measurements on 4xN kernels
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2023-01-02 14:19:59 +01:00 |
|
Rafael Ravedutti
|
15d43dcce5
|
Explicitly set half_neigh to zero on stubbed versions
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2022-12-14 17:21:09 +01:00 |
|
Rafael Ravedutti
|
fa4e38c6c4
|
Add IACA and stubbed measurements for GROMACS 4x8 FN kernel
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2022-11-18 01:00:20 +01:00 |
|
Rafael Ravedutti
|
04ea1b027e
|
Print kernel and precision info in gromacs-stub
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2022-11-16 16:15:15 +01:00 |
|
Rafael Ravedutti
|
56d9613028
|
Implement stubbed version for GROMACS
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2022-11-15 16:01:13 +01:00 |
|
Rafael Ravedutti
|
f293cec960
|
Call CPU version of updatePbc within setupPbc
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2022-11-14 19:19:57 +01:00 |
|
Rafael Ravedutti
|
6eedf1776e
|
Small fixes to GROMACS GPU code
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2022-11-14 18:21:14 +01:00 |
|
Rafael Ravedutti
|
93188d1383
|
Adjust NVCC flags to avoid issues with atomicAdd with doubles
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2022-11-14 18:01:46 +01:00 |
|
Rafael Ravedutti
|
c70ebce4c1
|
Integrate GROMACS GPU implementation into master branch
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2022-11-08 18:33:23 +01:00 |
|
Rafael Ravedutti
|
493915fe95
|
Fix code for AVX and remove warnings
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2022-11-08 15:30:37 +01:00 |
|
Jan Eitzinger
|
3d0f4b97ee
|
Switch copyright header in source files.
|
2022-09-05 10:39:42 +02:00 |
|
Rafael Ravedutti
|
28d3946072
|
Move common modules to common directory
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2022-08-17 17:56:31 +02:00 |
|
Rafael Ravedutti
|
47db9e86b0
|
Introduce common directory
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2022-08-17 17:20:57 +02:00 |
|
Rafael Ravedutti
|
29fa08fa7f
|
Enhance output for gromacs variant
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2022-08-16 19:32:49 +02:00 |
|
Rafael Ravedutti
|
911ba63336
|
Adjust ISA options and improve output
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2022-08-16 18:36:47 +02:00 |
|
Rafael Ravedutti
|
2e77f6207b
|
Avoid errors when compiling for AVX2 due to SIMD LJ implementation
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2022-07-19 02:30:26 +02:00 |
|
Rafael Ravedutti
|
ab2eb1ff50
|
Write LAMMPS kernel with SIMD intrinsics and implement AVX512 with double-precision functions
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2022-04-05 02:57:23 +02:00 |
|
Rafael Ravedutti
|
e48b3fb653
|
Add option to check if cj is local before applying reaction force
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2022-04-04 21:52:40 +02:00 |
|
Rafael Ravedutti
|
fdbeed4368
|
Fix AVX2 versions with half neighbor lists
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2022-03-27 16:39:39 +02:00 |
|
Rafael Ravedutti
|
0e742766b7
|
Add working version of Simd4xn kernel with half neighbor lists
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2022-03-23 15:54:18 +01:00 |
|
Rafael Ravedutti
|
e72323ab6a
|
Fix Simd2xnn Kernel with half neighbor lists and add AVX512 intrinsics with double
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2022-03-23 15:21:07 +01:00 |
|
Rafael Ravedutti
|
94521f03b3
|
Fix reference version with half neighbor lists
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2022-03-23 14:31:47 +01:00 |
|
Rafael Ravedutti
|
8709bc2a06
|
Add first version for half neighbor lists in GROMACS variant
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2022-03-22 23:47:05 +01:00 |
|
Rafael Ravedutti
|
887f41871c
|
Add parameter reading for LAMMPS variant
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2022-03-17 02:44:34 +01:00 |
|
Rafael Ravedutti
|
d4b34e1fa4
|
Fix intrinsics for AVX2
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2022-03-17 00:35:21 +01:00 |
|
Rafael Ravedutti
|
4090f43095
|
Optimize partial forces reduction for compute_4xn kernel
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2022-03-16 17:54:52 +01:00 |
|
Rafael Ravedutti
|
f3263a2d48
|
Separate simd file into multiple files
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2022-03-16 14:52:55 +01:00 |
|
Rafael Ravedutti
|
d47173d7a2
|
Fix Simd2xNN kernel
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2022-03-15 19:59:10 +01:00 |
|
Rafael Ravedutti
|
d61576699d
|
Add first compilable version of Gromacs with SP
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2022-03-15 02:40:56 +01:00 |
|
Rafael Ravedutti
|
8669f2f6d7
|
Fix LJ Simd4xN kernel
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2022-03-11 01:12:59 +01:00 |
|
Rafael Ravedutti
|
d79c3c2a1d
|
Add first working version with 4x8 config (ref kernel)
Signed-off-by: Rafael Ravedutti <rafaelravedutti@gmail.com>
|
2022-03-10 22:33:41 +01:00 |
|