Update method docstrings #607

Merged
merged 8 commits on Sep 30, 2024
7 changes: 4 additions & 3 deletions docs/src/reference.md
@@ -78,10 +78,11 @@ pad_zeros

`NNlib.conv` supports complex datatypes on CPU and CUDA devices.

-!!! AMDGPU MIOpen supports only cross-correlation (flipkernel=true).
-Therefore for every regular convolution (flipkernel=false)
+!!! note "AMDGPU MIOpen supports only cross-correlation (`flipkernel=true`)."
+
+Therefore for every regular convolution (`flipkernel=false`)
kernel is flipped before calculation.
-For better performance, use cross-correlation (flipkernel=true)
+For better performance, use cross-correlation (`flipkernel=true`)
and manually flip the kernel before `NNlib.conv` call.
`Flux` handles this automatically, this is only required for direct calls.
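
A minimal sketch of the advice above, an illustration only, assuming the `flipped` keyword of `NNlib.conv` corresponds to `flipkernel` and a `WHCN` array layout:

```julia
using NNlib

x = rand(Float32, 32, 32, 3, 1)    # W×H×C×N input
w = rand(Float32, 5, 5, 3, 8)      # W×H×C_in×C_out kernel

# Regular convolution: on AMDGPU/MIOpen the kernel gets flipped internally.
y1 = conv(x, w)

# Cross-correlation path: flip the spatial dims by hand, then request flipped=true.
w_flipped = reverse(w; dims=(1, 2))
y2 = conv(x, w_flipped; flipped=true)

y1 ≈ y2    # same result; the second form avoids the extra flip on MIOpen
```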

50 changes: 25 additions & 25 deletions src/activations.jl
@@ -31,7 +31,7 @@ The ascii name `sigmoid` is also exported.

See also [`sigmoid_fast`](@ref).

-```
+```julia-repl
julia> using UnicodePlots

julia> lineplot(sigmoid, -5, 5, height=7)
@@ -63,7 +63,7 @@ const sigmoid = σ

Piecewise linear approximation of [`sigmoid`](@ref).

-```
+```julia-repl
julia> lineplot(hardsigmoid, -5, 5, height=7)
┌────────────────────────────────────────┐
1 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⢀⡠⠖⠋⠉⠉⠉⠉⠉⠉⠉⠉│ hardσ(x)
@@ -102,7 +102,7 @@ const hardsigmoid = hardσ

Return `log(σ(x))` which is computed in a numerically stable way.

-```
+```julia-repl
julia> lineplot(logsigmoid, -5, 5, height=7)
┌────────────────────────────────────────┐
0 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⡧⠤⠔⠒⠒⠒⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉│ logσ(x)
@@ -128,7 +128,7 @@ Segment-wise linear approximation of `tanh`, much cheaper to compute.
See ["Large Scale Machine Learning"](https://ronan.collobert.com/pub/matos/2004_phdthesis_lip6.pdf).

See also [`tanh_fast`](@ref).
-```
+```julia-repl
julia> lineplot(hardtanh, -2, 2, height=7)
┌────────────────────────────────────────┐
1 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⣀⠔⠋⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉│ hardtanh(x)
@@ -164,7 +164,7 @@ hardtanh(x) = clamp(x, oftype(x, -1), oftype(x, 1)) # clamp(x, -1, 1) is type-s
[Rectified Linear Unit](https://en.wikipedia.org/wiki/Rectifier_(neural_networks))
activation function.

-```
+```julia-repl
julia> lineplot(relu, -2, 2, height=7)
┌────────────────────────────────────────┐
2 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⠔⠋│ relu(x)
@@ -188,7 +188,7 @@ Leaky [Rectified Linear Unit](https://en.wikipedia.org/wiki/Rectifier_(neural_ne
activation function.
You can also specify the coefficient explicitly, e.g. `leakyrelu(x, 0.01)`.

-```julia
+```julia-repl
julia> lineplot(x -> leakyrelu(x, 0.5), -2, 2, height=7)
┌────────────────────────────────────────┐
2 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⠤⠒⠉│ #42(x)
@@ -220,7 +220,7 @@ const leakyrelu_a = 0.01 # also used in gradient below
activation function capped at 6.
See ["Convolutional Deep Belief Networks"](https://www.cs.toronto.edu/~kriz/conv-cifar10-aug2010.pdf) from CIFAR-10.

-```
+```julia-repl
julia> lineplot(relu6, -10, 10, height=7)
┌────────────────────────────────────────┐
6 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⠎⠉⠉⠉⠉⠉⠉⠉⠉│ relu6(x)
@@ -245,7 +245,7 @@ Randomized Leaky Rectified Linear Unit activation function.
See ["Empirical Evaluation of Rectified Activations"](https://arxiv.org/abs/1505.00853)
You can also specify the bound explicitly, e.g. `rrelu(x, 0.0, 1.0)`.

-```julia
+```julia-repl
julia> lineplot(rrelu, -20, 10, height=7)
┌────────────────────────────────────────┐
10 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢸⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⡤⠖⠋│ rrelu(x)
@@ -275,7 +275,7 @@ Exponential Linear Unit activation function.
See ["Fast and Accurate Deep Network Learning by Exponential Linear Units"](https://arxiv.org/abs/1511.07289).
You can also specify the coefficient explicitly, e.g. `elu(x, 1)`.

-```
+```julia-repl
julia> lineplot(elu, -2, 2, height=7)
┌────────────────────────────────────────┐
2 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⠤⠒⠉│ elu(x)
@@ -305,7 +305,7 @@ deriv_elu(Ω, α=1) = ifelse(Ω ≥ 0, one(Ω), Ω + oftype(Ω, α))

Activation function from ["Gaussian Error Linear Units"](https://arxiv.org/abs/1606.08415).

-```
+```julia-repl
julia> lineplot(gelu, -2, 2, height=7)
┌────────────────────────────────────────┐
2 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⡠⠔⠊│ gelu(x)
@@ -363,7 +363,7 @@ end
Self-gated activation function.
See ["Swish: a Self-Gated Activation Function"](https://arxiv.org/abs/1710.05941).

-```
+```julia-repl
julia> lineplot(swish, -2, 2, height=7)
┌────────────────────────────────────────┐
2 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⡤│ swish(x)
@@ -386,7 +386,7 @@ julia> lineplot(swish, -2, 2, height=7)
Hard-Swish activation function.
See ["Searching for MobileNetV3"](https://arxiv.org/abs/1905.02244).

-```
+```julia-repl
julia> lineplot(hardswish, -2, 5, height = 7)
┌────────────────────────────────────────┐
5 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⡤⠔⠒⠉│ hardswish(x)
@@ -430,7 +430,7 @@ deriv_hardswish(x) = ifelse(x < -3, oftf(x,0), ifelse(x > 3, oftf(x,1), x/3 + of
Activation function from
["LiSHT: Non-Parametric Linearly Scaled Hyperbolic Tangent ..."](https://arxiv.org/abs/1901.05894)

-```
+```julia-repl
julia> lineplot(lisht, -2, 2, height=7)
┌────────────────────────────────────────┐
2 │⠢⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⠔│ lisht(x)
@@ -469,7 +469,7 @@ lisht(x) = x * tanh_fast(x)
Scaled exponential linear units.
See ["Self-Normalizing Neural Networks"](https://arxiv.org/abs/1706.02515).

-```
+```julia-repl
julia> lineplot(selu, -3, 2, height=7)
┌────────────────────────────────────────┐
3 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ selu(x)
@@ -507,7 +507,7 @@ end

Activation function from ["Continuously Differentiable Exponential Linear Units"](https://arxiv.org/abs/1704.07483).

-```
+```julia-repl
julia> lineplot(celu, -2, 2, height=7)
┌────────────────────────────────────────┐
2 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⠤⠒⠉│ celu(x)
@@ -535,7 +535,7 @@ deriv_celu(Ω, α=1) = ifelse(Ω > 0, oftf(Ω, 1), Ω / oftf(Ω, α) + 1)
Threshold gated rectified linear activation function.
See ["Zero-bias autoencoders and the benefits of co-adapting features"](https://arxiv.org/abs/1402.3337)

-```
+```julia-repl
julia> lineplot(trelu, -2, 4, height=7)
┌────────────────────────────────────────┐
4 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⡤⠖⠋│ trelu(x)
@@ -559,7 +559,7 @@ const thresholdrelu = trelu

See ["Quadratic Polynomials Learn Better Image Features"](http://www.iro.umontreal.ca/~lisa/publications2/index.php/attachments/single/205) (2009).

-```
+```julia-repl
julia> lineplot(softsign, -5, 5, height=7)
┌────────────────────────────────────────┐
1 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⣀⣀⣀⣀⠤⠤⠤⠤⠤│ softsign(x)
@@ -602,7 +602,7 @@ deriv_softsign(x) = 1 / (1 + abs(x))^2

See ["Deep Sparse Rectifier Neural Networks"](http://proceedings.mlr.press/v15/glorot11a/glorot11a.pdf), JMLR 2011.

-```
+```julia-repl
julia> lineplot(softplus, -3, 3, height=7)
┌────────────────────────────────────────┐
4 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ softplus(x)
@@ -640,7 +640,7 @@ softplus(x) = log1p(exp(-abs(x))) + relu(x)

Return `log(cosh(x))` which is computed in a numerically stable way.

-```
+```julia-repl
julia> lineplot(logcosh, -5, 5, height=7)
┌────────────────────────────────────────┐
5 │⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ logcosh(x)
@@ -664,7 +664,7 @@ const log2 = log(2)

Activation function from ["Mish: A Self Regularized Non-Monotonic Neural Activation Function"](https://arxiv.org/abs/1908.08681).

-```
+```julia-repl
julia> lineplot(mish, -5, 5, height=7)
┌────────────────────────────────────────┐
5 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⡠⠖⠋│ mish(x)
@@ -686,7 +686,7 @@ mish(x) = x * tanh(softplus(x))

See ["Tanhshrink Activation Function"](https://www.gabormelli.com/RKB/Tanhshrink_Activation_Function).

-```
+```julia-repl
julia> lineplot(tanhshrink, -3, 3, height=7)
┌────────────────────────────────────────┐
3 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ tanhshrink(x)
@@ -712,7 +712,7 @@ tanhshrink(x) = x - tanh_fast(x)

See ["Softshrink Activation Function"](https://www.gabormelli.com/RKB/Softshrink_Activation_Function).

-```
+```julia-repl
julia> lineplot(softshrink, -2, 2, height=7)
┌────────────────────────────────────────┐
2 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀│ softshrink(x)
@@ -770,7 +770,7 @@ For any other number types, it just calls `tanh`.

See also [`sigmoid_fast`](@ref).

-```
+```julia-repl
julia> tanh(0.5f0)
0.46211717f0

@@ -808,11 +808,11 @@ tanh_fast(x::Number) = Base.tanh(x)
sigmoid_fast(x)

This is a faster, and very slightly less accurate, version of `sigmoid`.
-For `x::Float32, perhaps 3 times faster, and maximum errors 2 eps instead of 1.
+For `x::Float32`, perhaps 3 times faster, and maximum errors 2 eps instead of 1.

See also [`tanh_fast`](@ref).

-```
+```julia-repl
julia> sigmoid(0.2f0)
0.54983395f0

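
A hedged micro-check of the accuracy claim above (an illustration under the `Float32` assumption, not a rigorous benchmark):

```julia
using NNlib

xs  = collect(range(-10f0, 10f0; length=100_001))
ref = Float32.(sigmoid.(Float64.(xs)))                             # well-rounded reference
worst_ulps = maximum(abs.(sigmoid_fast.(xs) .- ref) ./ eps.(ref))  # max error in units of eps
```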
2 changes: 1 addition & 1 deletion src/audio/mel.jl
@@ -4,7 +4,7 @@
fmin::Float32 = 0f0, fmax::Float32 = Float32(sample_rate ÷ 2))
Create triangular Mel scale filter banks
-(ref: https://en.wikipedia.org/wiki/Mel_scale).
+(ref: [Mel scale - Wikipedia](https://en.wikipedia.org/wiki/Mel_scale)).
Each column is a filterbank that highlights its own frequency.
# Arguments:
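
The full signature is truncated above, so here is a hedged, self-contained sketch of the triangular construction being described; it does not use the NNlib API, and every name in it is made up for illustration:

```julia
hz_to_mel(f) = 2595f0 * log10(1f0 + f / 700f0)
mel_to_hz(m) = 700f0 * (10f0^(m / 2595f0) - 1f0)

function triangular_filterbanks(n_freqs, n_mels, sample_rate;
                                fmin=0f0, fmax=Float32(sample_rate ÷ 2))
    mel_pts = range(hz_to_mel(fmin), hz_to_mel(fmax); length=n_mels + 2)
    hz_pts  = mel_to_hz.(mel_pts)
    freqs   = range(0f0, Float32(sample_rate ÷ 2); length=n_freqs)
    fb = zeros(Float32, n_freqs, n_mels)
    for m in 1:n_mels
        lo, center, hi = hz_pts[m], hz_pts[m + 1], hz_pts[m + 2]
        for (i, f) in enumerate(freqs)
            rising  = (f - lo) / (center - lo)
            falling = (hi - f) / (hi - center)
            fb[i, m] = max(0f0, min(rising, falling))   # triangle peaking at `center`
        end
    end
    return fb   # each column highlights its own frequency band
end
```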
14 changes: 7 additions & 7 deletions src/audio/stft.jl
@@ -5,14 +5,14 @@
) where T <: Real
Hamming window function
-(ref: https://en.wikipedia.org/wiki/Window_function#Hann_and_Hamming_windows).
+(ref: [Window function § Hann and Hamming windows - Wikipedia](https://en.wikipedia.org/wiki/Window_function#Hann_and_Hamming_windows)).
Generalized version of `hann_window`.
-``w[n] = \\alpha - \\beta cos(\\frac{2 \\pi n}{N - 1})``
+``w[n] = \\alpha - \\beta \\cos(\\frac{2 \\pi n}{N - 1})``
Where ``N`` is the window length.
-```julia
+```julia-repl
julia> lineplot(hamming_window(100); width=30, height=10)
┌──────────────────────────────┐
1 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡠⠚⠉⠉⠉⠢⡄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│
@@ -72,13 +72,13 @@ end
) where T <: Real
Hann window function
-(ref: https://en.wikipedia.org/wiki/Window_function#Hann_and_Hamming_windows).
+(ref: [Window function § Hann and Hamming windows - Wikipedia](https://en.wikipedia.org/wiki/Window_function#Hann_and_Hamming_windows)).
-``w[n] = \\frac{1}{2}[1 - cos(\\frac{2 \\pi n}{N - 1})]``
+``w[n] = \\frac{1}{2}[1 - \\cos(\\frac{2 \\pi n}{N - 1})]``
Where ``N`` is the window length.
-```julia
+```julia-repl
julia> lineplot(hann_window(100); width=30, height=10)
┌──────────────────────────────┐
1 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣠⠚⠉⠉⠉⠢⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│
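
A small sketch of the two window formulas above (Hamming with α = 0.54, β = 0.46; Hann with α = β = 0.5); the package functions may handle further options such as periodic windows and element types, which this ignores, and the function name is hypothetical:

```julia
# w[n] = α - β cos(2πn / (N - 1)),  n = 0, …, N - 1
generalized_window(n; α=0.54, β=0.46) = [α - β * cos(2π * k / (n - 1)) for k in 0:n - 1]

hamming = generalized_window(100)                  # Hamming coefficients
hann    = generalized_window(100; α=0.5, β=0.5)    # Hann coefficients
```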
@@ -138,7 +138,7 @@ Short-time Fourier transform (STFT).
The STFT computes the Fourier transform of short overlapping windows of the input,
giving frequency components of the signal as they change over time.
-``Y[\\omega, m] = \\sum_{k = 0}^{N - 1} \\text{window}[k] \\text{input}[m \\times \\text{hop length} + k] exp(-j \\frac{2 \\pi \\omega k}{\\text{n fft}})``
+``Y[\\omega, m] = \\sum_{k = 0}^{N - 1} \\text{window}[k] \\text{input}[m \\times \\text{hop length} + k] \\exp(-j \\frac{2 \\pi \\omega k}{\\text{n fft}})``
where ``N`` is the window length,
``\\omega`` is the frequency ``0 \\le \\omega < \\text{n fft}``
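
A purely illustrative sketch of the sum above: a naive loop over a single 1-D signal, with a made-up name; the actual implementation is batched and FFT-based:

```julia
function naive_stft(x::AbstractVector, window::AbstractVector; hop_length::Int)
    n_fft = length(window)
    n_frames = (length(x) - n_fft) ÷ hop_length + 1
    Y = zeros(ComplexF64, n_fft, n_frames)
    for m in 0:n_frames - 1, ω in 0:n_fft - 1, k in 0:n_fft - 1
        Y[ω + 1, m + 1] += window[k + 1] * x[m * hop_length + k + 1] *
                           exp(-im * 2π * ω * k / n_fft)
    end
    return Y   # frequencies along dim 1, frames along dim 2
end
```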
3 changes: 2 additions & 1 deletion src/ctc.jl
@@ -23,7 +23,8 @@ function logaddexp(a, b)
end

"""
-add_blanks(z)
+    add_blanks(z)
+
Adds blanks to the start and end of `z`, and between items in `z`
"""
function add_blanks(z, blank)
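
A hedged sketch of the behaviour described in the docstring (not the package implementation; the name is hypothetical):

```julia
add_blanks_sketch(z, blank) = vcat(blank, reduce(vcat, [[zi, blank] for zi in z]))

add_blanks_sketch([1, 2, 3], 0)    # => [0, 1, 0, 2, 0, 3, 0]
```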
2 changes: 1 addition & 1 deletion src/dim_helpers/DepthwiseConvDims.jl
@@ -2,7 +2,7 @@
DepthwiseConvDims
Concrete subclass of `ConvDims` for a depthwise convolution. Differs primarily due to
-characterization by C_in, C_mult, rather than C_in, C_out. Useful to be separate from
+characterization by `C_in`, `C_mult`, rather than `C_in`, `C_out`. Useful to be separate from
DenseConvDims primarily for channel calculation differences.
"""
struct DepthwiseConvDims{N, K, S, P, D} <: ConvDims{N}
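
A hedged illustration of the `C_in`/`C_mult` characterization via the public `depthwiseconv`, assuming a `(spatial..., C_mult, C_in)` kernel layout:

```julia
using NNlib

x = rand(Float32, 16, 16, 3, 1)    # C_in = 3
w = rand(Float32, 3, 3, 4, 3)      # C_mult = 4; last dim must equal C_in
y = depthwiseconv(x, w)

size(y, 3)                         # 12 == C_in * C_mult output channels
```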
2 changes: 1 addition & 1 deletion src/dim_helpers/PoolDims.jl
@@ -1,6 +1,6 @@
"""
PoolDims(x_size::NTuple{M}, k::Union{NTuple{L, Int}, Int};
-stride=k, padding=0, dilation=1) where {M, L}
+             stride=k, padding=0, dilation=1) where {M, L}
Dimensions for a "pooling" operation that can have an arbitrary input size, kernel size,
stride, dilation, and channel count. Used to dispatch onto efficient implementations at
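
A hedged usage sketch of the constructor signature shown above, assuming `maxpool` accepts a precomputed `PoolDims`:

```julia
using NNlib

x = rand(Float32, 28, 28, 3, 16)
pdims = PoolDims(size(x), (2, 2); stride=(2, 2), padding=0, dilation=1)
y = maxpool(x, pdims)    # dispatches on the precomputed dimensions
size(y)                  # (14, 14, 3, 16)
```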
2 changes: 1 addition & 1 deletion src/dropout.jl
@@ -12,7 +12,7 @@ i.e. each row of a matrix is either zero or not.
Optional first argument is the random number generator used.
# Examples
-```
+```julia-repl
julia> dropout(ones(2, 10), 0.2)
2×10 Matrix{Float64}:
1.25 1.25 0.0 1.25 1.25 1.25 1.25 1.25 1.25 1.25
4 changes: 2 additions & 2 deletions src/pooling.jl
@@ -162,7 +162,7 @@ Perform mean pool operation with window size `k` on input tensor `x`.
Arguments:
-* `x` and `k`: Expects `ndim(x) ∈ 3:5``, and always `length(k) == ndim(x) - 2`
+* `x` and `k`: Expects `ndim(x) ∈ 3:5`, and always `length(k) == ndim(x) - 2`
* `pad`: See [`pad_zeros`](@ref) for details.
* `stride`: Either a tuple with the same length as `k`, or one integer for all directions. Default is `k`.
"""
@@ -182,7 +182,7 @@ This pooling operator from [Learned-Norm Pooling for Deep Feedforward and Recurr
Arguments:
-* `x` and `k`: Expects `ndim(x) ∈ 3:5``, and always `length(k) == ndim(x) - 2`
+* `x` and `k`: Expects `ndim(x) ∈ 3:5`, and always `length(k) == ndim(x) - 2`
* `p` is restricted to `0 < p < Inf`.
* `pad`: See [`pad_zeros`](@ref) for details.
* `stride`: Either a tuple with the same length as `k`, or one integer for all directions. Default is `k`.
2 changes: 1 addition & 1 deletion src/softmax.jl
@@ -39,7 +39,7 @@ Note that, when used with Flux.jl, `softmax` must not be passed to layers like `D
which accept an activation function. The activation is broadcasted over the result,
thus applies to individual numbers. But `softmax` always needs to see the whole column.
-```julia
+```julia-repl
julia> using Flux
julia> x = randn(Float32, 4, 4, 3, 13);
2 changes: 1 addition & 1 deletion src/utils.jl
@@ -10,7 +10,7 @@ pass it an array whose gradient is of interest.
There is also an overload for ForwardDiff.jl's `Dual` types (and arrays of them).
# Examples
-```
+```julia-repl
julia> using ForwardDiff, Zygote, NNlib
julia> f_good(x) = if NNlib.within_gradient(x)