Error in the docs and in combining layers #329

Open · marcobonici opened this issue Sep 21, 2024 · 1 comment

marcobonici commented Sep 21, 2024

I need to train a Normalizing flow on some samples and then use it as a distribution.
@Red-Portal suggested I use Bijectors.jl. However, when trying to follow the example in the documentation, I noticed that there is an error in it.

julia> sb = stack(ibs...) # == Stacked(ibs) == Stacked(ibs, [i:i for i = 1:length(ibs)]
ERROR: MethodError: no method matching length(::Inverse{Bijectors.Logit{Float64, Float64}})

Closest candidates are:
  length(!Matched::LibGit2.GitBlob)
   @ LibGit2 /opt/hostedtoolcache/julia/1.10.5/x64/share/julia/stdlib/v1.10/LibGit2/src/blob.jl:3
  length(!Matched::LibGit2.GitStatus)
   @ LibGit2 /opt/hostedtoolcache/julia/1.10.5/x64/share/julia/stdlib/v1.10/LibGit2/src/status.jl:21
  length(!Matched::DataStructures.DiBitVector)
   @ DataStructures ~/.julia/packages/DataStructures/95DJa/src/dibit_vector.jl:40

For my use case, that is not important, but I thought it was worth mentioning.

So I tried a PlanarLayer for my use case, but it did not work well: the training ran, yet the result was not satisfying. Then, as suggested by the tutorial, I tried composing a couple of layers to see whether that would improve performance... and I got this error when I reached the training step:

ArgumentError: broadcasting over dictionaries and `NamedTuple`s is reserved

Stacktrace:
 [1] broadcastable(::@NamedTuple{w::Vector{Float64}, u::Vector{Float64}, b::Vector{Float64}})
   @ Base.Broadcast ./broadcast.jl:744
 [2] broadcasted
   @ ./broadcast.jl:1345 [inlined]
 [3] (::var"#1#2")(θ::PlanarLayer{Vector{Float64}, Vector{Float64}}, ∇::@NamedTuple{w::Vector{Float64}, u::Vector{Float64}, b::Vector{Float64}})
   @ Main ./In[2]:31
 [4] (::Base.var"#4#5"{var"#1#2"})(a::Tuple{PlanarLayer{Vector{Float64}, Vector{Float64}}, @NamedTuple{w::Vector{Float64}, u::Vector{Float64}, b::Vector{Float64}}})
   @ Base ./generator.jl:36
 [5] iterate(::Base.Generator{Vector{Any}, IRTools.Inner.var"#52#53"{IRTools.Inner.var"#54#55"{IRTools.Inner.Block}}})
   @ Base ./generator.jl:47 [inlined]
 [6] collect(itr::Base.Generator{Base.Iterators.Zip{Tuple{@NamedTuple{outer::PlanarLayer{Vector{Float64}, Vector{Float64}}, inner::PlanarLayer{Vector{Float64}, Vector{Float64}}}, Tuple{@NamedTuple{w::Vector{Float64}, u::Vector{Float64}, b::Vector{Float64}}, @NamedTuple{w::Vector{Float64}, u::Vector{Float64}, b::Vector{Float64}}}}}, Base.var"#4#5"{var"#1#2"}})
   @ Base ./array.jl:834
 [7] map(::Function, ::@NamedTuple{outer::PlanarLayer{Vector{Float64}, Vector{Float64}}, inner::PlanarLayer{Vector{Float64}, Vector{Float64}}}, ::Tuple{@NamedTuple{w::Vector{Float64}, u::Vector{Float64}, b::Vector{Float64}}, @NamedTuple{w::Vector{Float64}, u::Vector{Float64}, b::Vector{Float64}}})
   @ Base ./abstractarray.jl:3406
 [8] top-level scope
   @ In[2]:30

Below is the MWE to reproduce the error. I am doing what is suggested in the docs, but maybe I misinterpreted something? Thanks in advance for your help!

using Zygote
using Bijectors
using Functors
b = PlanarLayer(2) ∘ PlanarLayer(2)
θs, reconstruct = Functors.functor(b);
       
struct NLLObjective{R,D,T}
    reconstruct::R
    basedist::D
    data::T
end


function (obj::NLLObjective)(θs...)
    transformed_dist = transformed(obj.basedist, obj.reconstruct(θs))
    return -sum(Base.Fix1(logpdf, transformed_dist), eachcol(obj.data))
end

xs = randn(2, 1000);

f = NLLObjective(reconstruct, MvNormal(2, 1), xs);
       
@info "Initial loss: $(f(θs...))"

ε = 1e-3;

for i in 1:100
    ∇s = Zygote.gradient(f, θs...)
    θs = map(θs, ∇s) do θ, ∇
        θ - ε .* ∇
    end
end
@info "Finall loss: $(f(θs...))"
       
samples = rand(transformed(f.basedist, f.reconstruct(θs)), 1000);

mean(eachcol(samples)) # ≈ [0, 0]

cov(samples; dims=2)   # ≈ I
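
For context on the MWE above: `Functors.functor(b)` splits the flow into its parameter NamedTuples plus a closure that rebuilds the flow from updated parameters. A minimal sketch of that pattern with a made-up `Affine` struct (not a Bijectors layer):

using Functors

struct Affine
    w::Vector{Float64}
    b::Vector{Float64}
end
@functor Affine

a = Affine([1.0, 2.0], [0.0, 0.0])
params, re = Functors.functor(a)
params                                      # (w = [1.0, 2.0], b = [0.0, 0.0])
a2 = re((w = [3.0, 4.0], b = [1.0, 1.0]))   # rebuild an Affine with new parameters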
torfjelde (Member) commented Sep 23, 2024

Thanks for raising these, @marcobonici! Both are now addressed in #330.

For the first issue, this is a mistake in the docs: `stack` has been removed from Bijectors (Julia's `Base` introduced its own `stack`, so our implementation ended up being type piracy).
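
For reference, the fix in the docs is to construct `Stacked` directly instead of calling the removed `stack`. A quick sketch of what that looks like (the particular distributions below are just for illustration, not necessarily the ones from the docs example):

using Bijectors, Distributions

# A few univariate distributions, their bijectors (to unconstrained space),
# and the corresponding inverse bijectors:
dists = (Beta(2, 2), InverseGamma(2, 3))
bs = map(bijector, dists)
ibs = map(inverse, bs)

# `stack(ibs...)` no longer exists; construct the `Stacked` bijector directly:
sb = Stacked(ibs)                                # default ranges [1:1, 2:2]
# or, with the component ranges spelled out explicitly:
sb = Stacked(ibs, [i:i for i in 1:length(ibs)])

x = randn(2)
y = sb(x)   # ibs[1] maps x[1] into (0, 1), ibs[2] maps x[2] into (0, ∞)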

For your second issue, the following works just fine :)

using Zygote, Bijectors, Functors
b = PlanarLayer(2) ∘ PlanarLayer(2)

θs, reconstruct = Functors.functor(b);
       
struct NLLObjective{R,D,T}
    reconstruct::R
    basedist::D
    data::T
end


# CHANGED: `θs...` -> `θs` so we maintain the same container-structure.
function (obj::NLLObjective)(θs)
    transformed_dist = transformed(obj.basedist, obj.reconstruct(θs))
    return -sum(Base.Fix1(logpdf, transformed_dist), eachcol(obj.data))
end

xs = randn(2, 1000);

f = NLLObjective(reconstruct, MvNormal(2, 1), xs);
       
@info "Initial loss: $(f(θs))"

# CHANGED: `1e-3` -> `1e-4` because I ran into divergences when using `1e-3`.
ε = 1e-4;

for i in 1:200
    # CHANGED: `θs... -> θs` again due to above.
    # CHANGED: `Zygote.gradient` returns a tuple of length equal to number of args,
    #   so we need to unpack the first component to get the gradient of `θs`.
    (∇s,) = Zygote.gradient(f, θs)
    # CHANGED: `map` -> `fmap` so we can map over nested `NamedTuple`s, etc.
    θs = fmap(θs, ∇s) do θ, ∇
        θ - ε .* ∇
    end
end
@info "Finall loss: $(f(θs))"
       
samples = rand(transformed(f.basedist, f.reconstruct(θs)), 1000);

mean(eachcol(samples)) # ≈ [0, 0]

cov(samples; dims=2)   # ≈ I

Basically, the example in the docs is not really the correct way to do things, but it just happened to work out 😬

To be compatible with nested transformations, etc., we need to use `fmap` instead of `map`. This wasn't an issue before because we splatted the NamedTuple (`θs...`) when passing it to the loss, and it was a single layer, so the resulting gradient was a flat NamedTuple; now that we compose transformations, this is no longer the case.
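
A toy illustration of the difference (not Bijectors-specific; the structures below are made up to mirror a composed flow):

using Functors

# Parameters/gradients of a composed flow are nested NamedTuples:
θs = (outer = (w = [1.0, 2.0], b = [0.5]), inner = (w = [3.0, 4.0], b = [0.1]))
∇s = (outer = (w = [0.1, 0.2], b = [0.1]), inner = (w = [0.1, 0.2], b = [0.1]))
ε = 1e-2

# `map(θs, ∇s) do θ, ∇ ... end` would hand the do-block the *inner NamedTuples*,
# and broadcasting over a NamedTuple is exactly the ArgumentError above.
# `fmap` recurses down to the array leaves, so the update is applied per-array:
θs_new = fmap(θs, ∇s) do θ, ∇
    θ - ε .* ∇
end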

torfjelde added a commit that referenced this issue Sep 23, 2024
* Fixed bug in docs as reported in #329

* Removed usage of removed `stack` in favour of `Stacked` in docs
example as reported in #329