1.4.0
- Modernize cmake build system.
  Provide VexCL::OpenCL, VexCL::Compute, VexCL::CUDA, and VexCL::JIT
  imported targets, so that users may just write

      add_executable(myprogram myprogram.cpp)
      target_link_libraries(myprogram VexCL::OpenCL)

  to build a program using the corresponding VexCL backend.
  Also stop polluting the global cmake namespace with things like
  add_definitions(), include_directories(), etc.
  See http://vexcl.readthedocs.io/en/latest/cmake.html.
- Make
  vex::backend::kernel::config() return a reference to the kernel, so
  that it is possible to configure and launch the kernel in a single line:
  K.config(nblocks, nthreads)(queue, prm1, prm2, prm3);
- Implement
  vector<T>::reinterpret<U>() method. It returns a new vector that
  reinterprets the same data (no copies are made) as the new type.
- Implemented new backend: JIT. The backend generates and compiles at runtime
  C++ kernels with OpenMP support. The generated code will not be more efficient
  than hand-written OpenMP code, but it is easy to debug with a host-side
  debugger. The backend may also be used to develop and test new code
  when other backends are not available.
- Let
  VEX_CONSTANTS be cast to their values in host code, so that a
  constant defined with VEX_CONSTANT(name, expr) can be used in host code
  as name. Constants are still usable in vector expressions as name().
- Allow passing generated kernel args for each GPU (#202).
  Kernel args packed into a std::vector will be unpacked and passed
  to the generated kernels on the respective devices.
- Reimplemented
  vex::SpMat as vex::sparse::ell, vex::sparse::crs,
  vex::sparse::matrix (automatically chooses one of the two formats based on
  the current compute device), and vex::sparse::distributed<format> (this one
  may span several compute devices). The new matrix-vector products are now
  normal vector expressions, while the old vex::SpMat could only be used in
  additive expressions. The old implementation is still available.
  vex::sparse::ell is now converted from host-side CRS format on the compute
  device, which makes the conversion faster.
- Bug fixes and minor improvements.
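
For readers unfamiliar with the two storage formats named in the vex::sparse entry above, the following is a minimal host-side sketch of what a CRS to ELL conversion involves. It is plain C++ and does not use VexCL. The names CrsMatrix, EllMatrix, crs_to_ell, and ell_spmv are hypothetical and chosen for illustration only, and the layout shown is a plain padded ELL, which may well differ from what vex::sparse::ell does internally.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side CRS (compressed row storage) matrix.
struct CrsMatrix {
    std::size_t rows;
    std::vector<std::size_t> row_ptr; // rows + 1 offsets into col/val
    std::vector<int> col;             // column index of each nonzero
    std::vector<double> val;          // value of each nonzero
};

// Hypothetical plain ELL matrix: every row is padded to the same width.
// Storage is column-major, so entry k of row i lives at k * rows + i.
struct EllMatrix {
    std::size_t rows;
    std::size_t width;
    std::vector<int> col;    // -1 marks a padding slot
    std::vector<double> val; // 0.0 in padding slots
};

// Convert CRS to padded ELL (assumption: simple padding, no hybrid part).
EllMatrix crs_to_ell(const CrsMatrix &A) {
    assert(A.row_ptr.size() == A.rows + 1);

    EllMatrix E{A.rows, 0, {}, {}};

    // The ELL width is the length of the longest row.
    for (std::size_t i = 0; i < A.rows; ++i)
        E.width = std::max(E.width, A.row_ptr[i + 1] - A.row_ptr[i]);

    E.col.assign(E.rows * E.width, -1);
    E.val.assign(E.rows * E.width, 0.0);

    // Copy each row's nonzeros into its padded slots.
    for (std::size_t i = 0; i < A.rows; ++i) {
        std::size_t k = 0;
        for (std::size_t j = A.row_ptr[i]; j < A.row_ptr[i + 1]; ++j, ++k) {
            E.col[k * E.rows + i] = A.col[j];
            E.val[k * E.rows + i] = A.val[j];
        }
    }
    return E;
}

// Reference matrix-vector product y = E * x over the ELL layout.
std::vector<double> ell_spmv(const EllMatrix &E, const std::vector<double> &x) {
    std::vector<double> y(E.rows, 0.0);
    for (std::size_t k = 0; k < E.width; ++k) {
        for (std::size_t i = 0; i < E.rows; ++i) {
            int c = E.col[k * E.rows + i];
            if (c >= 0) y[i] += E.val[k * E.rows + i] * x[c];
        }
    }
    return y;
}
```

The column-major layout is the usual reason ELL suits GPUs: consecutive rows read consecutive memory locations at each step k. In this sketch the conversion runs on the host; the changelog entry describes performing it on the compute device instead.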