Releases: ttsiodras/MandelbrotSSE

20% speedup, and control of reused pixel percent and max iterations

16 Jul 11:01

You can now control:

  • the percentage of pixels actually computed per frame (option -p). For example, if you pass -p 0.5, only 0.5% of the pixels are derived through the Mandelbrot computations in each frame; the remaining 100 - 0.5 = 99.5% are copied from the previous frame.
  • the maximum number of iterations in the Mandelbrot loop (option -i). By default this is set to 2048 to allow for decent zoom levels, but if you want to see insane speeds, set this to something low, like 128, and disable the frame limiter; i.e. use options -f 0 -i 128.
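
To make the -p arithmetic concrete, here is a minimal sketch in C. The helper names pixels_to_compute and pixels_reused, and the round-to-nearest choice, are assumptions made for illustration; they are not taken from the project's source.

```c
#include <assert.h>

/* Hypothetical helper: given a frame size and the percentage passed
 * via -p, return how many pixels would be recomputed in a frame. */
static long pixels_to_compute(long width, long height, double percent)
{
    assert(percent >= 0.0 && percent <= 100.0);
    double total = (double)width * (double)height;
    /* Round to nearest so small percentages are not truncated away. */
    return (long)(total * percent / 100.0 + 0.5);
}

/* Hypothetical helper: the pixels that are copied from the previous frame. */
static long pixels_reused(long width, long height, double percent)
{
    return width * height - pixels_to_compute(width, height, percent);
}
```

For a 1000x1000 window and -p 0.5, this gives 5,000 computed pixels and 995,000 reused ones.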

Also, the per-scanline computational load is very unevenly distributed, so OpenMP dynamic scheduling is now used to make the best use of multiple cores. Result: 20% speedup.
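
As a rough illustration of why dynamic scheduling helps, the sketch below parallelizes a plain escape-time renderer over scanlines. It is not the project's core loop: mandel_iters, render and the fixed view window are assumptions chosen to keep the example small. Scanlines that cross the set cost up to max_iter iterations per pixel while others escape almost immediately, so handing out rows on demand keeps all threads busy.

```c
#include <assert.h>

/* Escape-time iteration count for one point c = (cr, ci). */
static int mandel_iters(double cr, double ci, int max_iter)
{
    double zr = 0.0, zi = 0.0;
    int n = 0;
    while (n < max_iter && zr * zr + zi * zi <= 4.0) {
        double t = zr * zr - zi * zi + cr;
        zi = 2.0 * zr * zi + ci;
        zr = t;
        n++;
    }
    return n;
}

/* Fill out[] (row-major, w*h entries) with iteration counts for the
 * view [-2, 1] x [-1.5, 1.5]. Build with -fopenmp to enable threading;
 * without it the pragma is skipped and the loop runs serially. */
static void render(int w, int h, int max_iter, int *out)
{
    assert(w > 1 && h > 1);
    /* schedule(dynamic): a thread that finishes a cheap scanline
     * immediately picks up the next unprocessed one. */
#ifdef _OPENMP
#pragma omp parallel for schedule(dynamic)
#endif
    for (int y = 0; y < h; y++) {
        double ci = -1.5 + 3.0 * y / (h - 1);
        for (int x = 0; x < w; x++) {
            double cr = -2.0 + 3.0 * x / (w - 1);
            out[y * w + x] = mandel_iters(cr, ci, max_iter);
        }
    }
}
```

With the default static schedule, each thread would get a fixed block of rows up front, and the threads holding the expensive rows would finish last while the others sit idle.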

AVX, SDL2, GCC FMV... Speed!

14 Jul 20:14
  • Run-time dispatching to inline assembly AVX/SSE or normal code. CoreLoopDoubleAVX is 80% faster than SSE :-)
  • Also puts GCC's function multiversioning (FMV) feature to use - creating AVX/SSE/normal versions of many functions and dispatching at run-time to the right one.
  • Added benchmarking mode (command line option -b)
  • Migrated to libSDL2 - no more video tearing. This means that under the hood, we now use OpenGL for rendering, and the window reacts appropriately to resizing.
  • The loop reading the SDL events queue no longer "buffers" events.
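
The run-time dispatching idea can be sketched by hand as follows. This is a simplified stand-in, not the project's code: the kernel is a plain array sum rather than CoreLoopDoubleAVX, and the names sum_plain, sum_avx, pick_sum and sum_dispatched are hypothetical. It relies on the GCC builtins __builtin_cpu_init and __builtin_cpu_supports and on the target function attribute, all guarded so that other compilers and architectures fall back to the plain version. GCC's FMV (for example through the target_clones attribute) generates this kind of dispatch automatically.

```c
#include <assert.h>
#include <stddef.h>

typedef double (*sum_fn)(const double *, size_t);

/* Portable fallback, compiled for the baseline instruction set. */
static double sum_plain(const double *v, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HAVE_X86_DISPATCH 1
/* Same source, but the compiler is allowed to emit AVX instructions here. */
__attribute__((target("avx")))
static double sum_avx(const double *v, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}
#endif

/* Choose an implementation once, based on what the CPU reports. */
static sum_fn pick_sum(void)
{
#ifdef HAVE_X86_DISPATCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx"))
        return sum_avx;
#endif
    return sum_plain;
}

static double sum_dispatched(const double *v, size_t n)
{
    sum_fn f = pick_sum();
    assert(f != NULL);
    return f(v, n);
}
```

In a real program the pointer returned by pick_sum would typically be cached at startup, so the CPU check is not repeated on every call.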

For ASCII-art (aalib/caca) we need SDL_Quit at the end.

03 Jun 18:31

After compiling SDL with --enable-video-caca and using SDL_VIDEODRIVER=caca ./src/mandelSSE I could not see the frame rate at the end. The stdio restoration at the end needed a call to SDL_Quit.

Minor fix - Intel SSE was misdetected.

03 Jun 18:10

Updated configure.ac to properly detect Intel SSE (previously, it was mis-detected on e.g. ARM machines).

Cleaning things up

15 Jan 23:24
  • Default to auto-pilot mode (easier for new users)
  • In auto-pilot mode, randomly choose one of the available zoom targets,
    and then run forever - cycling through all available ones
  • Don't zoom beyond the levels allowed by IEEE754 accuracy (i.e. double precision)
  • Increase XaoS algorithm accuracy for larger windows
  • Minor fixups.
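
One way to picture the IEEE754 zoom limit: once the distance between two adjacent pixels is smaller than the spacing of doubles around the current coordinate, both pixels map to the same number and further zooming only produces blocks. The check below is a sketch under that assumption; the function name zoom_step_is_resolvable is hypothetical, and the project may well detect the limit differently.

```c
#include <assert.h>

/* Hypothetical check: returns 1 if stepping one pixel away from `center`
 * still yields a different double, 0 if the step is lost to rounding. */
static int zoom_step_is_resolvable(double center, double pixel_step)
{
    assert(pixel_step > 0.0);
    /* volatile forces the sum to be stored as a 64-bit double, so extra
     * precision in wider registers cannot hide the rounding. */
    volatile double next = center + pixel_step;
    return next != center;
}
```

Around a coordinate such as -0.75, doubles are spaced roughly 1e-16 apart, so a pixel step of 1e-3 is resolvable while a step of 1e-20 is not.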

First fully operational release.

01 Nov 23:10
  • Improvement in the way autoconf/automake works (e.g. Git versioning)
  • Added OpenMP and SSE support in my version of the XaoS algorithm
  • Added periodicity checking in both the pipelined floating point computations and the SSE/SSE2 inline assembly versions
  • Introduced proper getopt-based parsing of the command line arguments
  • Cleaned up the code.
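
For readers unfamiliar with periodicity checking, here is a minimal scalar sketch in C. It is not the project's pipelined or SSE code: the function name mandel_iters_periodic, the iters_done out-parameter and the doubling checkpoint interval are assumptions for illustration. The idea is that a point inside the set often falls into a repeating orbit; if z returns exactly to a previously saved value, it can never escape, so the loop can stop early instead of running all max_iter iterations.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the escape iteration, or max_iter if the point is treated as
 * inside the set. *iters_done (optional) reports how many iterations
 * were actually executed, to show the early exit. */
static int mandel_iters_periodic(double cr, double ci, int max_iter,
                                 int *iters_done)
{
    assert(max_iter > 0);
    double zr = 0.0, zi = 0.0;
    double saved_r = 0.0, saved_i = 0.0; /* checkpoint of an earlier z */
    int next_save = 8;                   /* iteration of next checkpoint */

    for (int n = 0; n < max_iter; n++) {
        if (zr * zr + zi * zi > 4.0) {
            if (iters_done != NULL) *iters_done = n;
            return n;                    /* escaped: outside the set */
        }
        double t = zr * zr - zi * zi + cr;
        zi = 2.0 * zr * zi + ci;
        zr = t;

        /* Orbit came back to the checkpoint: it is periodic. */
        if (zr == saved_r && zi == saved_i) {
            if (iters_done != NULL) *iters_done = n + 1;
            return max_iter;
        }
        /* Move the checkpoint at growing intervals to catch longer cycles. */
        if (n == next_save) {
            saved_r = zr;
            saved_i = zi;
            next_save *= 2;
        }
    }
    if (iters_done != NULL) *iters_done = max_iter;
    return max_iter;
}
```

For c = -1 the orbit alternates between -1 and 0, so the check fires after two iterations rather than after the full max_iter.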

At this point, I am out of ideas on what else to optimize/improve :-)