0e742766b7 Add working version of Simd4xn kernel with half neighbor lists
Rafael Ravedutti
2022-03-23 15:54:18 +01:00
e72323ab6a Fix Simd2xnn Kernel with half neighbor lists and add AVX512 intrinsics with double
Rafael Ravedutti
2022-03-23 15:21:07 +01:00
94521f03b3 Fix reference version with half neighbor lists
Rafael Ravedutti
2022-03-23 14:31:47 +01:00
8709bc2a06 Add first version for half neighbor lists in GROMACS variant
Rafael Ravedutti
2022-03-22 23:47:05 +01:00
2a555a7deb Add simd reduction pragma to vectorize innermost loop on half-neighbor variant
Rafael Ravedutti
2022-03-21 17:02:09 +01:00
719330807b Change data layout for force arrays according to position
Rafael Ravedutti
2022-03-18 01:40:51 +01:00
e7737e9151 Refactor half neighbor lists code
Rafael Ravedutti
2022-03-18 01:28:11 +01:00
5df544637f Fix force calculation time in LAMMPS variant
Rafael Ravedutti
2022-03-17 02:53:58 +01:00
887f41871c Add parameter reading for LAMMPS variant
Rafael Ravedutti
2022-03-17 02:44:34 +01:00
d4b34e1fa4 Fix intrinsics for AVX2
Rafael Ravedutti
2022-03-17 00:35:21 +01:00
4090f43095 Optimize partial forces reduction for compute_4xn kernel
Rafael Ravedutti
2022-03-16 17:54:52 +01:00
f3263a2d48 Separate simd file into multiple files
Rafael Ravedutti
2022-03-16 14:52:55 +01:00
459853dc25 Merge pull request #4 from RRZE-HPC/gromacs_sp
rafaelravedutti
2022-03-15 20:31:42 +01:00
d47173d7a2 Fix Simd2xNN kernel
Rafael Ravedutti
2022-03-15 19:59:10 +01:00
d61576699d Add first compilable version of Gromacs with SP
Rafael Ravedutti
2022-03-15 02:40:56 +01:00
8669f2f6d7 Fix LJ Simd4xN kernel
Rafael Ravedutti
2022-03-11 01:12:59 +01:00
d79c3c2a1d Add first working version with 4x8 config (ref kernel)
Rafael Ravedutti
2022-03-10 22:33:41 +01:00
c2fcd50773 Initial version of lammps halfneighbor list
Jan Eitzinger
2022-03-10 17:06:45 +01:00
ba3a0524f6 Merge branch 'master' of github.com:RRZE-HPC/MD-Bench
Jan Eitzinger
2022-03-10 16:30:40 +01:00
6203cb12b6 Start to introduce halfneigh version
Jan Eitzinger
2022-03-10 16:30:37 +01:00
22d0f0b958 Commit version that works for M=N
Rafael Ravedutti
2022-03-10 01:31:50 +01:00
2b441e691e Make code compilable
Rafael Ravedutti
2022-03-09 17:23:49 +01:00
c7360305c8 Add first draft version of GROMACS method separating i-clusters and j-clusters
Rafael Ravedutti
2022-03-09 02:25:39 +01:00
cecb31d6a9 Update params for argon_1000 test case
Rafael Ravedutti
2022-03-07 14:49:38 +01:00
ba6785a865 Allow parameter reading from files and update data
Rafael Ravedutti
2022-03-05 03:21:52 +01:00
aae29a5b5a Add code to read GRO files
Rafael Ravedutti
2022-03-03 20:03:33 +01:00
af92800c64 Add SIMD version with AVX (no AVX2) and XTC output
Rafael Ravedutti
2022-03-02 23:12:04 +01:00
022aa75c75 Add cutoff radius and skin as parameters of simulation
Rafael Ravedutti
2022-02-28 22:34:42 +01:00
1389f89fb7 Add pruning kernel
Rafael Ravedutti
2022-02-28 17:20:39 +01:00
c62e4ea4ad Add clusters efficiency on stats
Rafael Ravedutti
2022-02-28 16:10:09 +01:00
ed2929c813 Add percentage of atoms within cutoff radius when using LAMMPS reference version
Rafael Ravedutti
2022-02-25 14:40:33 +01:00
e637a26844 Add percentage of atoms within cutoff radius when using GROMACS reference version
Rafael Ravedutti
2022-02-25 14:19:48 +01:00
fdd18df816 Fix argon simulation
Rafael Ravedutti
2022-02-24 16:42:58 +01:00
1a708f2d3b Add PDB reading functions to lammps variant
Rafael Ravedutti
2022-02-24 15:17:51 +01:00
d0ec9520f2 Write function to read PDB files and include data for Argon simulation
Rafael Ravedutti
2022-02-24 02:36:17 +01:00
ca7775a62a Add average atoms per cluster on stats
Rafael Ravedutti
2022-02-09 17:50:54 +01:00
769bab0faa Separate local and ghost cluster edges on VTK output
Rafael Ravedutti
2022-02-08 16:12:22 +01:00
6a35a7a482 Update stats for cluster version
Rafael Ravedutti
2022-02-08 00:55:27 +01:00
8deee3d954 Add cluster edges in VTK output
Rafael Ravedutti
2022-02-08 00:11:10 +01:00
cd15911a97 When building neighbor lists, skip first iterations until z is in range
Rafael Ravedutti
2022-02-07 18:28:53 +01:00
0eacb2453e Inline getBoundingBoxDistanceSq and avoid redundant loads from bbminz and bbmaxz
Rafael Ravedutti
2022-02-07 18:00:21 +01:00
b024adaf5b Re-measure for 2000 time steps
mucosim_cuda
Maximilian Gaul
2022-02-04 17:40:12 +01:00
cdb1d5b9f1 Add version with AVX2 intrinsics for gromacs scheme
Rafael Ravedutti
2022-02-04 17:52:48 +01:00
34ce407f18 Update stats for gromacs scheme
Rafael Ravedutti
2022-02-04 14:47:37 +01:00
6e6a3f6502 Use aligned loads when gathering j atoms
Rafael Ravedutti
2022-02-04 14:29:32 +01:00
7b90800a2b Setting forces to zero before calculation is not required
Rafael Ravedutti
2022-02-04 14:05:04 +01:00
9daf9e5f4d Fix exclusion masks and add SIMD debug tools
Rafael Ravedutti
2022-02-02 21:54:18 +01:00
4c5f013bf4 Assign masked adds results to forces
Rafael Ravedutti
2022-02-02 18:07:56 +01:00
6ad1e58a3e Add first kernel using SIMD intrinsics for 4xn cases
Rafael Ravedutti
2022-02-02 18:00:44 +01:00
5fd2d422ee Adjust kernels to work with MxN loops
Rafael Ravedutti
2022-02-02 00:49:55 +01:00
85e7954932 Check all clusters in cell when building neighbor lists because ghost clusters may not be sorted
Rafael Ravedutti
2022-02-01 20:16:04 +01:00
4a5216a177 Remove bb z-check on while loop when building neighbor lists
Rafael Ravedutti
2022-02-01 00:46:12 +01:00
e64c3345bc Fix a few more bugs on gromacs variant
Rafael Ravedutti
2022-01-31 23:46:20 +01:00
696e6da01d Implement Neighbour list AoS memory layout + performance measurement
Maximilian Gaul
2022-01-31 20:27:59 +01:00
e0e6b6a68c Perform a few fixes for gromacs variant
Rafael Ravedutti
2022-01-31 17:49:22 +01:00
6691803910 Add first version of force calculation with cluster scheme
Rafael Ravedutti
2022-01-28 18:07:41 +01:00
eedcc97e4a Remove segfaults
Rafael Ravedutti
2022-01-28 15:18:54 +01:00
a119fcdfdd Fix some segfaults and add function to update single atoms
Rafael Ravedutti
2022-01-27 03:07:31 +01:00
aa0f4048d0 Rename default directory to lammps and reorganize gromacs variant steps
Rafael Ravedutti
2022-01-25 21:00:11 +01:00
cbe42b8149 Fix errors to make gromacs approach compilable so far
Rafael Ravedutti
2022-01-25 12:19:28 +01:00
6291709ae7 Add first draft code with GROMACS approach
Rafael Ravedutti
2022-01-25 00:43:10 +01:00
b2a6574426 Remove unnecessary atom force backcopy in computeForce
Maximilian Gaul
2022-01-24 18:09:27 +01:00
c4080e866e Make integrate kernels aware of neighbour list update
Maximilian Gaul
2022-01-24 18:04:50 +01:00
72730bc27b Update Makefile and config.mk
Rafael Ravedutti
2022-01-17 14:16:39 +01:00
df09c2861e Add first version with more than one optimization scheme
Rafael Ravedutti
2022-01-17 14:15:02 +01:00
489e7ee9d3 Update .gitignore
Rafael Ravedutti
2022-01-17 11:46:57 +01:00
165335cea0 Update compilation flags for all available compilers
Rafael Ravedutti
2022-01-17 11:40:44 +01:00
7b592b5fc7 Moved presentation resources to second presentation
Maximilian Gaul
2022-01-05 12:48:37 +01:00
4690542db5 Added CPU metrics {Cache, FLOPS, L2, L3}, restructured resource folders
Maximilian Gaul
2022-01-05 12:31:47 +01:00
8c131a7699 Reminder for likwid perf measurements
Maximilian Gaul
2022-01-04 13:51:53 +01:00
dc4d5f1a9c Porting atom velocity memory layout to AoS, porting velocity integration to CUDA, adding measurements + logbook update
Maximilian Gaul
2022-01-01 18:18:12 +01:00
50007216ed Implemented atom force AoS memory layout, added performance measurements + logbook update
Maximilian Gaul
2022-01-01 16:09:21 +01:00
72e4599acc Copy neighbour lists only when reneighbouring happens, added measurements + logbook update
Maximilian Gaul
2022-01-01 12:56:42 +01:00
8fa03733e9 Copy parameters & cutforces threshold only once at the start + measurements
Maximilian Gaul
2021-12-28 16:48:26 +01:00
bf1ae3d013 Removed debug prints, only zero atom forces and not copy them, added measurements
Maximilian Gaul
2021-12-28 16:32:54 +01:00
8009b54113 Trying to debug segfault if cudaMemcpy is limited to neighbour list update
Maximilian Gaul
2021-12-25 15:36:08 +01:00
0ea0587442 Only malloc once at the beginning plus measurement csv
Maximilian Gaul
2021-12-25 13:52:33 +01:00
134e3f4b78 Also pinned neighbor-struct memory, added additional performance measurements, added nvprof result to logbook
Maximilian Gaul
2021-12-18 15:58:56 +01:00
c2bfa3ca3f Add scripts for perf measurement, made atom-memory allocation pinned using 'cudaMallocHost', added measurements for atom pinned memory
Maximilian Gaul
2021-12-18 13:02:04 +01:00
2a099da5b7 Started cuda profiling, added first result to logbook
Maximilian Gaul
2021-12-03 08:13:43 +01:00
7691b23d67 Measure memory transfer of CPU to GPU, add explanation how to distribute calculation among multiple GPUs
Maximilian Gaul
2021-12-01 17:16:32 +01:00
35c110155e Separate tracing from force computation and fix stubbed version
Rafael Ravedutti
2021-12-01 00:07:45 +01:00
bb21a885a1 Add new setups for Copper melting with LJ and EAM
Rafael Ravedutti
2021-11-30 01:33:55 +01:00
da90466f98 Added first performance measurements with threads per block from 1 to 32
Maximilian Gaul
2021-11-25 08:09:20 +01:00
8f723c1299 Added command line description of MD-Bench, added memory transfer rate from CPU to GPU to force.cu
Maximilian Gaul
2021-11-23 15:55:23 +01:00
0586ef150a Fix num of threads instead of num of blocks, add logbook template
Maximilian Gaul
2021-11-15 19:39:09 +01:00
2e5d973f7d Rough rewrite to execute outer loop of force calculation in parallel, not inner loop
Maximilian Gaul
2021-11-14 10:02:23 +01:00
e2fd1a0476 Fixed bug, results are now equal to master branch (but still slow)
Maximilian Gaul
2021-11-11 21:00:30 +01:00
4105c844c6 Runs fine (but slow), results seem to be slightly off from original
Maximilian Gaul
2021-11-11 20:47:06 +01:00
1f5c9c4b23 Fixed segfault error, added more cudaErrorChecks, added cudaFree to avoid memory leak
Maximilian Gaul
2021-11-11 20:29:14 +01:00
29e115464b Fixed cudaMemcpy for AOS data layout, added debug outputs, added cudaErrorChecks
Maximilian Gaul
2021-11-11 20:14:30 +01:00
1a54314c8b First run but segfault at the moment after a few seconds
Maximilian Gaul
2021-11-11 15:23:46 +01:00
280f595b7f Fixed linker error by putting includes and cuda function in extern 'C'
Maximilian Gaul
2021-11-11 14:49:29 +01:00
3428974730 getTimeStamp() couldn't get linked
Maximilian Gaul
2021-11-11 08:03:56 +01:00
b54842f764 Added Makefile instructions for .cu files
Maximilian Gaul
2021-11-11 07:27:12 +01:00
9730164e6f Rename force.c to force.cu because of cuda build errors
Maximilian Gaul
2021-11-10 16:20:04 +01:00
0f5fdd3708 Sum results after cuda function executed
Maximilian Gaul
2021-11-10 16:02:05 +01:00
f7010113bf Include commented timestamping on asm
Rafael Ravedutti
2021-11-10 14:39:44 +01:00
841dfb9490 Fix data types for rdr and rdrho
Rafael Ravedutti
2021-11-09 20:36:23 +01:00
3f7fb7f22a cudaMemcpy of Atom and other properties, first draft implementation of CUDA kernel
Maximilian Gaul
2021-11-09 16:40:25 +01:00