AOB

School project on optimizing calculations, using a fluid simulator as the example. By DRISSI Reda and HASAN JUSUF Yusnanda - IATIC4 ISTY

PREREQUISITES

You need:

openjdk-x-jdk (replace x with your version)
swing
gcc with OpenMP support

MAKING AND BUILDING

If your libgomp.so is not in the expected location, you can locate it with:

find / \( -path /mnt -o -path /media -o -path /home -o -path /tmp -o -path /cache \) -prune -o -name "libgomp.so" -print 2>/dev/null

There is no need to search /mnt, /media, /home, /tmp or /cache, which is why the command skips them. Search them only if you have a reason to think the library is there (please don't put it there).

Build the project with make, then run it with make run.
The run step exports LD_LIBRARY_PATH itself, so you do not have to set it each time you open a terminal.

TRYING OUT DIFFERENT OPTIMIZATIONS

Currently our targets are:

nopti for the basic code with no optimizations
fast for compile-time optimizations
pinc for the pre-increment optimization
muldiv for replacing division with multiplication
inl for inlining build_index and setBoundry
omp for the OpenMP integration
fin for all of the above

When testing repeatedly, the quickest way is to run:

   make clean && make <target> && make run

Running make with no target builds the highest level of optimization.

CURRENT IMPROVEMENTS

Using OpenMP to parallelize the computation
Inlining functions, because calling a function costs more than running a copy of its body in place
Using -Ofast, mostly for -ftree-vectorize, -frename-registers and -ffast-math
Using -mavx2, then replacing the constant c with 1/c so that divisions become multiplications, because division is much slower than multiplication even with AVX
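
The three source-level changes above (inlining, the 1/c constant and OpenMP) can be sketched together as follows. This is only an illustration under assumptions, not the code from fluid.c: the body of build_index, the grid layout with one border cell on each side, and the scale_interior function are all hypothetical.

```c
#include <assert.h>

/* Assumed layout: a (grid_size + 2) x (grid_size + 2) grid stored row by
   row, with one border cell on each side. The real build_index may differ. */
static inline int build_index(int i, int j, int grid_size)
{
    return i + (grid_size + 2) * j;
}

/* Hypothetical kernel: divide every interior cell by c.
   The division happens once (inv_c = 1 / c); the loop only multiplies. */
void scale_interior(float *cells, int grid_size, float c)
{
    assert(c != 0.0f);
    const float inv_c = 1.0f / c;

    /* Rows are independent in this kernel, so they can be shared between
       threads. Without -fopenmp the pragma is ignored and the loop runs
       serially with the same result. */
    #pragma omp parallel for
    for (int j = 1; j <= grid_size; ++j) {
        for (int i = 1; i <= grid_size; ++i) {
            cells[build_index(i, j, grid_size)] *= inv_c;
        }
    }
}
```

Calling scale_interior(cells, 2, 4.0f) on a 4 x 4 array leaves the border cells untouched and multiplies the four interior cells by 0.25. Note that multiplying by 1/c can round differently from dividing by c, which is the kind of change -ffast-math also permits.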

USELESS IMPROVEMENTS

Storing build_index(i,j,grid_size) in a variable, then adding or subtracting the needed offset instead of recomputing the core value each time, gave no real performance improvement.
In setBoundry, moving the if statement that does not depend on the loop iterations out of the core function added no real value either.
Splitting long calculations into separate steps to avoid mixing additions and multiplications.
Using pre-increment instead of post-increment (in case gcc does not already handle that, which it most probably does).
Unrolling loops, even manually, is useless because the iterations depend on each other.
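
To make the first item concrete, here is a minimal sketch of the two ways of reaching a cell's four neighbours. The body of build_index and both neighbour_sum functions are assumptions made for illustration; the actual code in fluid.c may look different.

```c
#include <assert.h>

/* Assumed layout: row-major grid with one border cell on each side. */
static inline int build_index(int i, int j, int grid_size)
{
    return i + (grid_size + 2) * j;
}

/* Version A: recompute the index for every neighbour. */
float neighbour_sum_recompute(const float *x, int i, int j, int grid_size)
{
    return x[build_index(i - 1, j, grid_size)]
         + x[build_index(i + 1, j, grid_size)]
         + x[build_index(i, j - 1, grid_size)]
         + x[build_index(i, j + 1, grid_size)];
}

/* Version B: compute the centre index once, then add or subtract offsets. */
float neighbour_sum_cached(const float *x, int i, int j, int grid_size)
{
    assert(i >= 1 && i <= grid_size && j >= 1 && j <= grid_size);
    const int stride = grid_size + 2;            /* distance between rows */
    const int idx = build_index(i, j, grid_size);
    return x[idx - 1] + x[idx + 1] + x[idx - stride] + x[idx + stride];
}
```

Both versions read the same four cells in the same order. Once build_index is inlined, an optimizing compiler can usually fold the repeated index arithmetic in version A by itself, which would be consistent with the lack of any measurable gain.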

IMPROVEMENT IDEAS

Using Intel intrinsic functions in order to use the SSE registers (better results than -mavx2 in theory)
Checking whether the build_index and setBoundry functions have room for improvement.
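
As a rough idea of what the intrinsics approach could look like, the sketch below multiplies an array by a precomputed 1/c four floats at a time. The scale_sse function and its parameters are hypothetical and not part of the project; it assumes an x86 target with SSE, which gcc enables by default on x86-64.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>   /* SSE intrinsics: __m128 and the _mm_*_ps calls */

/* Hypothetical kernel: multiply n floats by inv_c, four per instruction. */
void scale_sse(float *x, size_t n, float inv_c)
{
    assert(x != NULL || n == 0);
    const __m128 factor = _mm_set1_ps(inv_c);   /* inv_c in all four lanes */
    size_t k = 0;

    /* Main loop: one 128-bit register holds four floats. */
    for (; k + 4 <= n; k += 4) {
        __m128 v = _mm_loadu_ps(x + k);         /* unaligned load */
        _mm_storeu_ps(x + k, _mm_mul_ps(v, factor));
    }

    /* Tail: the remaining 0 to 3 elements are handled one by one. */
    for (; k < n; ++k) {
        x[k] *= inv_c;
    }
}
```

The unaligned load and store are used so the sketch works for any pointer; if the arrays were allocated with 16-byte alignment, _mm_load_ps and _mm_store_ps could be used instead. Whether this beats the compiler's own vectorization would have to be measured.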
