d47173d7a2
Fix Simd2xNN kernel
Rafael Ravedutti
2022-03-15 19:59:10 +01:00
d61576699d
Add first compilable version of Gromacs with SP
Rafael Ravedutti
2022-03-15 02:40:56 +01:00
8669f2f6d7
Fix LJ Simd4xN kernel
Rafael Ravedutti
2022-03-11 01:12:59 +01:00
d79c3c2a1d
Add first working version with 4x8 config (ref kernel)
Rafael Ravedutti
2022-03-10 22:33:41 +01:00
c2fcd50773
Initial version of lammps halfneighbor list
Jan Eitzinger
2022-03-10 17:06:45 +01:00
ba3a0524f6
Merge branch 'master' of github.com:RRZE-HPC/MD-Bench
Jan Eitzinger
2022-03-10 16:30:40 +01:00
6203cb12b6
Start to introduce halfneigh version
Jan Eitzinger
2022-03-10 16:30:37 +01:00
22d0f0b958
Commit version that works for M=N
Rafael Ravedutti
2022-03-10 01:31:50 +01:00
2b441e691e
Make code compilable
Rafael Ravedutti
2022-03-09 17:23:49 +01:00
c7360305c8
Add first draft version of GROMACS method separating i-clusters and j-clusters
Rafael Ravedutti
2022-03-09 02:25:39 +01:00
cecb31d6a9
Update params for argon_1000 test case
Rafael Ravedutti
2022-03-07 14:49:38 +01:00
ba6785a865
Allow parameter reading from files and update data
Rafael Ravedutti
2022-03-05 03:21:52 +01:00
aae29a5b5a
Add code to read GRO files
Rafael Ravedutti
2022-03-03 20:03:33 +01:00
af92800c64
Add SIMD version with AVX (no AVX2) and XTC output
Rafael Ravedutti
2022-03-02 23:12:04 +01:00
022aa75c75
Add cutoff radius and skin as parameters of simulation
Rafael Ravedutti
2022-02-28 22:34:42 +01:00
1389f89fb7
Add pruning kernel
Rafael Ravedutti
2022-02-28 17:20:39 +01:00
c62e4ea4ad
Add clusters efficiency on stats
Rafael Ravedutti
2022-02-28 16:10:09 +01:00
ed2929c813
Add percentage of atoms within cutoff radius when using LAMMPS reference version
Rafael Ravedutti
2022-02-25 14:40:33 +01:00
e637a26844
Add percentage of atoms within cutoff radius when using GROMACS reference version
Rafael Ravedutti
2022-02-25 14:19:48 +01:00
fdd18df816
Fix argon simulation
Rafael Ravedutti
2022-02-24 16:42:58 +01:00
1a708f2d3b
Add PDB reading functions to lammps variant
Rafael Ravedutti
2022-02-24 15:17:51 +01:00
d0ec9520f2
Write function to read PDB files and include data for Argon simulation
Rafael Ravedutti
2022-02-24 02:36:17 +01:00
ca7775a62a
Add average atoms per cluster on stats
Rafael Ravedutti
2022-02-09 17:50:54 +01:00
769bab0faa
Separate local and ghost cluster edges on VTK output
Rafael Ravedutti
2022-02-08 16:12:22 +01:00
6a35a7a482
Update stats for cluster version
Rafael Ravedutti
2022-02-08 00:55:27 +01:00
8deee3d954
Add cluster edges in VTK output
Rafael Ravedutti
2022-02-08 00:11:10 +01:00
cd15911a97
When building neighbor lists, skip first iterations until z is in range
Rafael Ravedutti
2022-02-07 18:28:53 +01:00
0eacb2453e
Inline getBoundingBoxDistanceSq and avoid redundant loads from bbminz and bbmaxz
Rafael Ravedutti
2022-02-07 18:00:21 +01:00
b024adaf5b
Re-measure for 2000 time steps
mucosim_cuda
Maximilian Gaul
2022-02-04 17:40:12 +01:00
cdb1d5b9f1
Add version with AVX2 intrinsics for gromacs scheme
Rafael Ravedutti
2022-02-04 17:52:48 +01:00
34ce407f18
Update stats for gromacs scheme
Rafael Ravedutti
2022-02-04 14:47:37 +01:00
6e6a3f6502
Use aligned loads when gathering j atoms
Rafael Ravedutti
2022-02-04 14:29:32 +01:00
7b90800a2b
Setting forces to zero before calculation is not required
Rafael Ravedutti
2022-02-04 14:05:04 +01:00
9daf9e5f4d
Fix exclusion masks and add SIMD debug tools
Rafael Ravedutti
2022-02-02 21:54:18 +01:00
4c5f013bf4
Assign masked adds results to forces
Rafael Ravedutti
2022-02-02 18:07:56 +01:00
6ad1e58a3e
Add first kernel using SIMD intrinsics for 4xn cases
Rafael Ravedutti
2022-02-02 18:00:44 +01:00
5fd2d422ee
Adjust kernels to work with MxN loops
Rafael Ravedutti
2022-02-02 00:49:55 +01:00
85e7954932
Check all clusters in cell when building neighbor lists because ghost clusters may not be sorted
Rafael Ravedutti
2022-02-01 20:16:04 +01:00
4a5216a177
Remove bb z-check on while loop when building neighbor lists
Rafael Ravedutti
2022-02-01 00:46:12 +01:00
e64c3345bc
Fix a few more bugs on gromacs variant
Rafael Ravedutti
2022-01-31 23:46:20 +01:00
696e6da01d
Implement Neighbour list AoS memory layout + performance measurement
Maximilian Gaul
2022-01-31 20:27:59 +01:00
e0e6b6a68c
Perform a few fixes for gromacs variant
Rafael Ravedutti
2022-01-31 17:49:22 +01:00
6691803910
Add first version of force calculation with cluster scheme
Rafael Ravedutti
2022-01-28 18:07:41 +01:00
eedcc97e4a
Remove segfaults
Rafael Ravedutti
2022-01-28 15:18:54 +01:00
a119fcdfdd
Fix some segfaults and add function to update single atoms
Rafael Ravedutti
2022-01-27 03:07:31 +01:00
aa0f4048d0
Rename default directory to lammps and reorganize gromacs variant steps
Rafael Ravedutti
2022-01-25 21:00:11 +01:00
cbe42b8149
Fix errors to make gromacs approach compilable so far
Rafael Ravedutti
2022-01-25 12:19:28 +01:00
6291709ae7
Add first draft code with GROMACS approach
Rafael Ravedutti
2022-01-25 00:43:10 +01:00
b2a6574426
Remove unnecessary atom force backcopy in computeForce
Maximilian Gaul
2022-01-24 18:09:27 +01:00
c4080e866e
Make integrate kernels aware of neighbour list update
Maximilian Gaul
2022-01-24 18:04:50 +01:00
72730bc27b
Update Makefile and config.mk
Rafael Ravedutti
2022-01-17 14:16:39 +01:00
df09c2861e
Add first version with more than one optimization scheme
Rafael Ravedutti
2022-01-17 14:15:02 +01:00
489e7ee9d3
Update .gitignore
Rafael Ravedutti
2022-01-17 11:46:57 +01:00
165335cea0
Update compilation flags for all available compilers
Rafael Ravedutti
2022-01-17 11:40:44 +01:00
7b592b5fc7
Moved presentation resources to second presentation
Maximilian Gaul
2022-01-05 12:48:37 +01:00
4690542db5
Added CPU metrics {Cache, FLOPS, L2, L3}, restructured resource folders
Maximilian Gaul
2022-01-05 12:31:47 +01:00
8c131a7699
Reminder for likwid perf measurements
Maximilian Gaul
2022-01-04 13:51:53 +01:00
dc4d5f1a9c
Porting atom velocity memory layout to AoS, porting velocity integration to CUDA, adding measurements + logbook update
Maximilian Gaul
2022-01-01 18:18:12 +01:00
50007216ed
Implemented atom force AoS memory layout, added performance measurements + logbook Update
Maximilian Gaul
2022-01-01 16:09:21 +01:00
72e4599acc
Copy neighbour lists only when reneighbouring happens, added measurements + logbook update
Maximilian Gaul
2022-01-01 12:56:42 +01:00
8fa03733e9
Copy parameters & cutforces threshold only once at the start + measurements
Maximilian Gaul
2021-12-28 16:48:26 +01:00
bf1ae3d013
Removed debug prints, only zero atom forces and not copy them, added measurements
Maximilian Gaul
2021-12-28 16:32:54 +01:00
8009b54113
Trying to debug segfault if cudaMemcpy is limited to neighbour list update
Maximilian Gaul
2021-12-25 15:36:08 +01:00
0ea0587442
Only malloc once at the beginning plus measurement csv
Maximilian Gaul
2021-12-25 13:52:33 +01:00
134e3f4b78
Also pinned neighbor-struct memory, added additional performance measurements, added nvprof result to logbook
Maximilian Gaul
2021-12-18 15:58:56 +01:00
c2bfa3ca3f
Add scripts for perf measurement, made atom-memory allocation pinned using 'cudaMallocHost', added measurements for atom pinned memory
Maximilian Gaul
2021-12-18 13:02:04 +01:00
2a099da5b7
Started cuda profiling, added first result to logbook
Maximilian Gaul
2021-12-03 08:13:43 +01:00
7691b23d67
Measure memory transfer of CPU to GPU, add explanation how to distribute calculation among multiple GPUs
Maximilian Gaul
2021-12-01 17:16:32 +01:00
35c110155e
Separate tracing from force computation and fix stubbed version
Rafael Ravedutti
2021-12-01 00:07:45 +01:00
bb21a885a1
Add new setups for Copper melting with LJ and EAM
Rafael Ravedutti
2021-11-30 01:33:55 +01:00
da90466f98
Added first performance measurements with threads per block from 1 to 32
Maximilian Gaul
2021-11-25 08:09:20 +01:00
8f723c1299
Added command line description of MD-Bench, added memory transfer rate from CPU to GPU to force.cu
Maximilian Gaul
2021-11-23 15:55:23 +01:00
0586ef150a
Fix num of threads instead of num of blocks, add logbook template
Maximilian Gaul
2021-11-15 19:39:09 +01:00
2e5d973f7d
Rough rewrite to execute outer loop of force calculation in parallel, not inner loop
Maximilian Gaul
2021-11-14 10:02:23 +01:00
e2fd1a0476
Fixed bug, results are now equal to master branch (but still slow)
Maximilian Gaul
2021-11-11 21:00:30 +01:00
4105c844c6
Runs fine (but slow), results seem to be slightly off from original
Maximilian Gaul
2021-11-11 20:47:06 +01:00
1f5c9c4b23
Fixed segfault error, added more cudaErrorChecks, added cudaFree to avoid memory leak
Maximilian Gaul
2021-11-11 20:29:14 +01:00
29e115464b
Fixed cudaMemcpy for AOS data layout, added debug outputs, added cudaErrorChecks
Maximilian Gaul
2021-11-11 20:14:30 +01:00
1a54314c8b
First run but segfault at the moment after a few seconds
Maximilian Gaul
2021-11-11 15:23:46 +01:00
280f595b7f
Fixed linker error by putting includes and cuda function in extern 'C'
Maximilian Gaul
2021-11-11 14:49:29 +01:00
3428974730
getTimeStamp() couldn't get linked
Maximilian Gaul
2021-11-11 08:03:56 +01:00
b54842f764
Added Makefile instructions for .cu files
Maximilian Gaul
2021-11-11 07:27:12 +01:00
9730164e6f
Rename force.c to force.cu because of cuda build errors
Maximilian Gaul
2021-11-10 16:20:04 +01:00
0f5fdd3708
Sum results after cuda function executed
Maximilian Gaul
2021-11-10 16:02:05 +01:00
f7010113bf
Include commented timestamping on asm
Rafael Ravedutti
2021-11-10 14:39:44 +01:00
841dfb9490
Fix data types for rdr and rdrho
Rafael Ravedutti
2021-11-09 20:36:23 +01:00
3f7fb7f22a
cudaMemcpy of Atom and other properties, first draft implementation of CUDA kernel
Maximilian Gaul
2021-11-09 16:40:25 +01:00