Walberla performance tracking ticket #4989

RudolfWeeber · 2024-08-27T12:33:37Z

More selective ghost communication in LB: avoid double communication #4921
Use AVX vectorization in all kernels (streaming, boundaries, reset force, ...). It might be useful to automate the generation of kernel_traits.hpp first (Simplify work with walberla kernels #4988).
replace
Switch to the pull scheme and use a combined stream-collide sweep (postponed, as I couldn't get our patched bounce-back boundary condition working with the pull scheme)
Generate the pack infos using PyStencils and use vectorization (currently, we use hand-written pack infos from Walberla proper). In particular, do such generated pack infos outperform the currently used memcpy pack info on the GPU?
Multi-GPU simulation with CUDA-aware MPI

RudolfWeeber · 2024-08-29T06:01:36Z

Just realized that we did not yet vectorize the streaming kernel. This is probably relatively low-hanging fruit for performance on the CPU.
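To illustrate why streaming is a natural vectorization target, here is a minimal sketch of a pull-style streaming step. It is not the Walberla or PyStencils kernel: the `D1Q3Field` struct, the `stream_pull` function, the one-dimensional three-velocity lattice and the structure-of-arrays layout with one ghost cell per side are all assumptions made for illustration. With this layout each direction becomes a unit-stride shifted copy without loop-carried dependencies, which is the kind of loop compilers can usually turn into SIMD loads and stores.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical D1Q3 lattice in structure-of-arrays layout: one contiguous
// array per velocity, with one ghost cell at each end (indices 0 and n + 1).
struct D1Q3Field {
    std::size_t n;             // number of interior cells
    std::vector<double> rest;  // populations with velocity  0
    std::vector<double> east;  // populations with velocity +1
    std::vector<double> west;  // populations with velocity -1

    explicit D1Q3Field(std::size_t cells)
        : n(cells),
          rest(cells + 2, 0.0),
          east(cells + 2, 0.0),
          west(cells + 2, 0.0) {}
};

// Pull-scheme streaming: every interior cell x reads the population that
// travels towards it, i.e. from cell x - c for lattice velocity c.
// src and dst must be distinct fields of the same size; ghost cells of src
// are assumed to have been filled beforehand (communication or boundaries).
void stream_pull(const D1Q3Field& src, D1Q3Field& dst) {
    assert(src.n == dst.n);
    assert(&src != &dst);

    const double* sr = src.rest.data();
    const double* se = src.east.data();
    const double* sw = src.west.data();
    double* dr = dst.rest.data();
    double* de = dst.east.data();
    double* dw = dst.west.data();

    // Unit-stride reads and writes, no dependency between iterations.
    for (std::size_t x = 1; x <= src.n; ++x) {
        dr[x] = sr[x];      // velocity  0: stays in place
        de[x] = se[x - 1];  // velocity +1: pulled from the left neighbour
        dw[x] = sw[x + 1];  // velocity -1: pulled from the right neighbour
    }
}
```

Only interior cells of `dst` are written; its ghost cells are left untouched and would have to be refreshed by the ghost communication before the next step.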

jngrad changed the title ~~Walberla performance trcking ticket~~ Walberla performance tracking ticket Aug 27, 2024
