v0.12.0 - Adds f16 dtype
Breaking changes
- [Breaking] Adding Tensor::try_realize, and Tensor::realize no longer returns Result by @coreylowman in #758 (migration sketch after this list)
- [Breaking] ReshapeTo::reshape_like and ReshapeTo::try_reshape_like now panic instead of returning option by @coreylowman in #766
- [Breaking] Adding dilation/groups to Conv2D. Adding dilation to Pool2D by @coreylowman in #767
- [Breaking] Use `gemm` for matmul. Removes support for matrixmultiply & MKL by @coreylowman in #776
- [Breaking] Moving storage GAT to trait level generic. Split DeviceStorage into multiple traits by @coreylowman in #782
- [Breaking] Adding dilation/groups to ConvTranspose2D by @coreylowman in #783
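
For reference when migrating, here is a minimal sketch of how the #758 and #766 changes look at a call site. It assumes the Cpu device and the usual `RealizeTo`/`ReshapeTo` usage; the shapes and generics below are illustrative, so double-check against the v0.12.0 docs before copying.

```rust
use dfdx::prelude::*;

fn main() {
    let dev: Cpu = Default::default();

    // Start from a tensor with compile-time (const) dimensions.
    let x: Tensor<Rank2<2, 3>, f32, _> = dev.zeros();

    // Erasing const dims into runtime (usize) dims always succeeds.
    let x_dyn: Tensor<(usize, usize), f32, _> = x.realize();

    // #758: `realize` now returns the tensor directly instead of a Result;
    // `try_realize` is the fallible variant for when the runtime shape
    // might not match the requested one.
    let x_static: Tensor<Rank2<2, 3>, f32, _> = x_dyn.realize();

    // #766: `reshape_like` now panics when the element counts differ,
    // instead of returning an Option.
    let flat: Tensor<(usize,), f32, _> = x_static.reshape_like(&(6usize,));
    assert_eq!(flat.shape().0, 6);
}
```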
What's Changed
- Adding f16 as Dtype by @coreylowman in #696
- Adding example by @sirandreww in #740
- Adds TryConcatAlong to support Concat along any axis by @coreylowman in #750 (usage sketch after this list)
- Changed CUDA_ARCH in compatibility.cuh by @jafioti in #752
- Allow `broadcast_like` to accept tensors OR shapes by @VasanthakumarV in #751
- Removing rerun build.rs for output destination by @coreylowman in #754
- Fixing compatibility for compute cap 70-75 by @coreylowman in #757
- Adds TriangleTensor and CmpKernel traits to Device bound by @coreylowman in #760
- Using Bernoulli distribution in dropout - makes dropout reproducible across dtypes by @coreylowman in #761
- Fixes bug with f16 mean where number of elements reduced was f16::INF by @coreylowman in #763
- Placeholder f16 gemm speedups by @coreylowman in #765
- MultiHeadAttention 3d impl now broadcasts to 4d instead of duplicating logic by @coreylowman in #768
- Moving `cudarc?/f16` behind `f16` feature by @coreylowman in #774
- impl Clone for Adam, SGD, RMSprop by @coreylowman in #775
- Properly setting read_dst for gemm in forward/backward pass by @coreylowman in #777
- Adds rayon dependency. Using `gemm::Parallelism::Rayon(rayon::current_num_threads())` by @coreylowman in #778
- Add LogSoftmax by @kurnevsky in #769
- Moving some tests off nightly. Adding docs to conv2d op by @coreylowman in #779
- Adding better error messages if nvidia-smi/nvcc are not found by @coreylowman in #784
- Using for loop with gridDim.x * blockDim.x as increment by @coreylowman in #787
- Removing __hmax and __hmin compat functions by @coreylowman in #788
- Uses grid striding in fill_with by @coreylowman in #790
- Exposed NumpyDType publicly by @jafioti in #791
- Fixing weight shape for grouped Conv2D by @coreylowman in #797
- Bump half/cudarc versions by @coreylowman in #805
- Using Groups in conv weight init by @coreylowman in #806
- Add scalar support to TensorCollection by @nkoppel in #799
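
The TryConcatAlong (#750) and `broadcast_like` (#751) entries above add shape utilities; here is a rough usage sketch. The method names and shape syntax follow the patterns used elsewhere in dfdx, but treat the exact signatures as assumptions rather than verified v0.12.0 API.

```rust
use dfdx::prelude::*;

fn main() {
    let dev: Cpu = Default::default();

    // #751: broadcast_like accepts either another tensor or a plain shape.
    let row: Tensor<Rank1<3>, f32, _> = dev.zeros();
    let target: Tensor<Rank2<2, 3>, f32, _> = dev.zeros();
    let from_tensor: Tensor<Rank2<2, 3>, f32, _> = row.clone().broadcast_like(&target);
    let from_shape: Tensor<Rank2<2, 3>, f32, _> = row.broadcast_like(&(Const::<2>, Const::<3>));
    assert_eq!(from_tensor.shape().concrete(), [2, 3]);
    assert_eq!(from_shape.shape().concrete(), [2, 3]);

    // #750: concat_along joins tensors along any axis; here the concatenated
    // axis uses runtime (usize) dims, so the output dim is their sum.
    let a: Tensor<(Const<2>, usize), f32, _> = dev.zeros_like(&(Const::<2>, 3usize));
    let b: Tensor<(Const<2>, usize), f32, _> = dev.zeros_like(&(Const::<2>, 4usize));
    let ab = (a, b).concat_along(Axis::<1>);
    assert_eq!(ab.shape().1, 7);
}
```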
New Contributors
- @sirandreww made their first contribution in #740
- @kurnevsky made their first contribution in #769
Full Changelog: v0.11.2...v0.12.0