Commit 3a20350: Ready ?

gdalle committed Dec 7, 2023
1 parent f3ad31a commit 3a20350
Showing 31 changed files with 487 additions and 562 deletions.
1 change: 1 addition & 0 deletions Project.toml
@@ -29,6 +29,7 @@ ChainRulesCore = "1.16"
DensityInterface = "0.4"
Distributions = "0.25"
DocStringExtensions = "0.9"
FillArrays = "1"
HMMBase = "1"
LinearAlgebra = "1.6"
PrecompileTools = "1.1"
35 changes: 23 additions & 12 deletions README.md
@@ -12,6 +12,22 @@

A Julia package for HMM modeling, simulation, inference and learning.

## Mathematical background

[Hidden Markov Models](https://en.wikipedia.org/wiki/Hidden_Markov_model) form a statistical modeling framework that is ubiquitous in signal processing, bioinformatics and many other fields. They capture the distribution of an observation sequence $(Y_t)$ by assuming the existence of a latent state sequence $(X_t)$ such that:

* the state sequence follows a (discrete-time, discrete-space) Markov chain with transition probabilities $\mathbb{P}_\theta(X_t | X_{t-1})$
* the observation distribution at each time is determined by the current state: $\mathbb{P}_\theta(Y_t | X_t)$

Following [Rabiner (1989)](https://ieeexplore.ieee.org/document/18626), we can list several statistical problems, each of which has an efficient solution algorithm that our package implements:

| Problem | Goal | Algorithm |
| ---------- | --------------------------------------------------------------------------------------------------------- | ---------------- |
| Evaluation | Likelihood of the observation sequence $\mathbb{P}_\theta(Y_{1:T})$ | Forward |
| Inference | State marginals $\mathbb{P}_\theta(X_t \vert Y_{1:T})$ | Forward-backward |
| Decoding | Most likely state sequence $\underset{X_{1:T}}{\mathrm{argmax}}~\mathbb{P}_\theta(X_{1:T} \vert Y_{1:T})$ | Viterbi |
| Learning | Best parameter $\underset{\theta}{\mathrm{argmax}}~\mathbb{P}_\theta(Y_{1:T})$ | Baum-Welch |
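
Both assumptions combine into a factorization of the joint distribution, which is what each of these algorithms exploits:

```math
\mathbb{P}_\theta(X_{1:T}, Y_{1:T}) = \mathbb{P}_\theta(X_1) \prod_{t=2}^T \mathbb{P}_\theta(X_t \vert X_{t-1}) \prod_{t=1}^T \mathbb{P}_\theta(Y_t \vert X_t)
```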

## Getting started

This package can be installed using Julia's package manager:
@@ -26,7 +42,7 @@ Then, you can create your first HMM as follows:

```julia
using Distributions, HiddenMarkovModels
init = [0.2, 0.8]
trans = [0.1 0.9; 0.7 0.3]
dists = [Normal(-1), Normal(1)]
dists = [Normal(-1.0), Normal(1.0)]
hmm = HMM(init, trans, dists)
```
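
As a first experiment, you can simulate from this HMM and evaluate the loglikelihood of the result. This is a minimal sketch based on the functions documented in the API reference; exact return conventions may differ between versions:

```julia
obs_seq = rand(hmm, 100).obs_seq  # simulate a length-100 observation sequence
logdensityof(hmm, obs_seq)        # loglikelihood, computed with the forward algorithm
```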

@@ -35,24 +51,19 @@ Take a look at the [documentation](https://gdalle.github.io/HiddenMarkovModels.j
## Main features

This package is **generic**.
Observations can be arbitrary Julia objects, not just scalars or arrays.
Their distributions only need to implement `rand(rng, dist)` and `logdensityof(dist, x)` from [DensityInterface.jl](https://github.com/JuliaMath/DensityInterface.jl).
Number types are not restricted to floating point, and automatic differentiation is supported in forward mode ([ForwardDiff.jl](https://github.com/JuliaDiff/ForwardDiff.jl)) and reverse mode ([ChainRules.jl](https://github.com/JuliaDiff/ChainRules.jl)).
Observations can be arbitrary Julia objects, not just scalars or arrays, because their distributions only need to implement `rand(rng, dist)` and `logdensityof(dist, x)` ([DensityInterface.jl](https://github.com/JuliaMath/DensityInterface.jl)).
Number types are not restricted to floating point, and automatic differentiation is supported in forward mode ([ForwardDiff.jl](https://github.com/JuliaDiff/ForwardDiff.jl)).
Time-heterogeneous or controlled HMMs are supported out of the box.
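
For instance, here is a minimal sketch of a custom observation distribution satisfying that interface. The `Coin` type and its symbol-valued samples are purely illustrative, and learning with Baum-Welch would additionally require a `fit!` method:

```julia
using DensityInterface
using Random: AbstractRNG

# A hypothetical biased-coin distribution over the symbols :heads and :tails.
struct Coin
    p_heads::Float64
end

# Sampling: draw a uniform number and compare it to the bias.
Base.rand(rng::AbstractRNG, c::Coin) = rand(rng) < c.p_heads ? :heads : :tails

# Density evaluation, as specified by DensityInterface.jl.
DensityInterface.DensityKind(::Coin) = HasDensity()
function DensityInterface.logdensityof(c::Coin, x)
    return x === :heads ? log(c.p_heads) : log(1 - c.p_heads)
end
```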

This package is **fast**.
All the inference functions have allocation-free versions, which leverage efficient linear algebra subroutines.
Multithreading is used to parallelize computations across sequences, and compatibility with various array types is ensured.
We include extensive benchmarks against Julia and Python competitors thanks to [BenchmarkTools.jl](https://github.com/JuliaCI/BenchmarkTools.jl) and [PythonCall.jl](https://github.com/cjdoris/PythonCall.jl).
We will include extensive benchmarks against Julia and Python competitors ([BenchmarkTools.jl](https://github.com/JuliaCI/BenchmarkTools.jl) + [PythonCall.jl](https://github.com/cjdoris/PythonCall.jl)).
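
To give an idea, the benchmark code in this commit exercises the allocation-free forward pass roughly as follows. This is a sketch assuming `hmm`, `obs_seq` and `seq_ends` are already defined, and it relies on internal helpers that are not part of the public API:

```julia
# Preallocate the forward storage once, then reuse it across repeated runs.
f_storage = initialize_forward(hmm, obs_seq; seq_ends=seq_ends)
forward!(f_storage, hmm, obs_seq; seq_ends=seq_ends)
```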

This package is **reliable**.
It gives the same results as the previous reference package [HMMBase.jl](https://github.com/maxmouchet/HMMBase.jl) up to numerical accuracy.
The test suite incorporates quality checks with [Aqua.jl](https://github.com/JuliaTesting/Aqua.jl), as well as linting and type stability checks with [JET.jl](https://github.com/aviatesk/JET.jl).
It gives the same results as the previous reference package ([HMMBase.jl](https://github.com/maxmouchet/HMMBase.jl)) up to numerical accuracy.
The test suite incorporates quality checks ([Aqua.jl](https://github.com/JuliaTesting/Aqua.jl)), as well as type stability analysis ([JET.jl](https://github.com/aviatesk/JET.jl)).
Detailed documentation will help you find the functions you need.

But this package is **limited in scope**.
It is designed for HMMs with a small number of states, because memory and runtime scale quadratically with the state count (even when the transition matrix is sparse).
It is also meant to perform best on CPUs; it has not been tested on GPUs.

## Contributing

If you spot a bug or want to ask about a new feature, please [open an issue](https://github.com/gdalle/HiddenMarkovModels.jl/issues) on the GitHub repository.
46 changes: 23 additions & 23 deletions benchmark/HMMBenchmark/src/algos.jl
@@ -16,16 +16,13 @@ function rand_gaussian_hmm(; configuration)
return hmm
end

function rand_seqs(; configuration)
@unpack obs_dim, seq_length, nb_seqs = configuration
obs_seqs = [[randn(obs_dim) for _ in 1:seq_length] for _ in 1:nb_seqs]
return MultiSeq(obs_seqs)
end

function benchmarkables_hiddenmarkovmodels(; configuration, algos)
@unpack seq_length, nb_seqs, bw_iter = configuration
hmm = rand_gaussian_hmm(; configuration)
obs_seqs = MultiSeq([rand(hmm, seq_length).obs_seq for _ in 1:nb_seqs])
obs_seqs = [rand(hmm, seq_length).obs_seq for _ in 1:nb_seqs]

obs_seq = reduce(vcat, obs_seqs)
seq_ends = cumsum(length.(obs_seqs))

benchs = Dict()

@@ -37,52 +34,55 @@ function benchmarkables_hiddenmarkovmodels(; configuration, algos)

if "logdensity" in algos
benchs["logdensity"] = @benchmarkable begin
logdensityof($hmm, $obs_seqs)
logdensityof($hmm, $obs_seq; seq_ends=$seq_ends)
end
end

if "forward" in algos
benchs["forward_init"] = @benchmarkable begin
initialize_forward_backward($hmm, $obs_seqs)
initialize_forward_backward($hmm, $obs_seq; seq_ends=$seq_ends)
end
benchs["forward!"] = @benchmarkable begin
forward!(f_storages, $hmm, $obs_seqs)
end setup = (f_storages = initialize_forward($hmm, $obs_seqs))
forward!(f_storage, $hmm, $obs_seq; seq_ends=$seq_ends)
end setup = (f_storage = initialize_forward($hmm, $obs_seq; seq_ends=$seq_ends))
end

if "viterbi" in algos
benchs["viterbi_init"] = @benchmarkable begin
initialize_viterbi($hmm, $obs_seqs)
initialize_viterbi($hmm, $obs_seq; seq_ends=$seq_ends)
end
benchs["viterbi!"] = @benchmarkable begin
viterbi!(v_storages, $hmm, $obs_seqs)
end setup = (v_storages = initialize_viterbi($hmm, $obs_seqs))
viterbi!(v_storage, $hmm, $obs_seq; seq_ends=$seq_ends)
end setup = (v_storage = initialize_viterbi($hmm, $obs_seq; seq_ends=$seq_ends))
end

if "forward_backward" in algos
benchs["forward_backward_init"] = @benchmarkable begin
initialize_forward_backward($hmm, $obs_seqs)
initialize_forward_backward($hmm, $obs_seq; seq_ends=$seq_ends)
end
benchs["forward_backward!"] = @benchmarkable begin
forward_backward!(fb_storages, $hmm, $obs_seqs)
end setup = (fb_storages = initialize_forward_backward($hmm, $obs_seqs))
forward_backward!(fb_storage, $hmm, $obs_seq; seq_ends=$seq_ends)
end setup = (
fb_storage = initialize_forward_backward($hmm, $obs_seq; seq_ends=$seq_ends)
)
end

if "baum_welch" in algos
benchs["baum_welch_init"] = @benchmarkable begin
initialize_baum_welch($hmm, $obs_seqs)
end
benchs["baum_welch!"] = @benchmarkable begin
baum_welch!(
bw_storage,
fb_storage,
logL_evolution,
$hmm,
$obs_seqs;
$obs_seq;
seq_ends=$seq_ends,
max_iterations=$bw_iter,
atol=-Inf,
loglikelihood_increasing=false,
)
end setup = (
bw_storage = initialize_baum_welch($hmm, $obs_seqs; max_iterations=$bw_iter)
fb_storage = initialize_forward_backward($hmm, $obs_seq; seq_ends=$seq_ends);
logL_evolution = Float64[];
sizehint!(logL_evolution, $bw_iter)
)
end

2 changes: 1 addition & 1 deletion benchmark/Manifest.toml
@@ -169,7 +169,7 @@ version = "0.1.0"
PythonCall = "6099a3de-0909-46bc-b1f4-468b9a2dfc0d"

[[deps.HiddenMarkovModels]]
deps = ["ChainRulesCore", "DensityInterface", "DocStringExtensions", "LinearAlgebra", "PrecompileTools", "Random", "Requires", "SimpleUnPack", "SparseArrays", "StatsAPI"]
deps = ["ChainRulesCore", "DensityInterface", "DocStringExtensions", "FillArrays", "LinearAlgebra", "PrecompileTools", "Random", "Requires", "SimpleUnPack", "SparseArrays", "StatsAPI"]
path = ".."
uuid = "84ca31d5-effc-45e0-bfda-5a68cd981f47"
version = "0.4.0"
10 changes: 3 additions & 7 deletions docs/make.jl
@@ -42,21 +42,17 @@ function literate_title(path)
end

pages = [
"Essentials" => [
"Home" => "index.md",
"Background" => "background.md",
"Alternatives" => "alternatives.md",
],
"Home" => "index.md",
"Tutorials" => [
"Basics" => joinpath("examples", "basics.md"),
"Distributions" => joinpath("examples", "distributions.md"),
"Periodic" => joinpath("examples", "periodic.md"),
"Controlled" => joinpath("examples", "controlled.md"),
],
"API reference" => "api.md",
"Advanced" => [
"Alternatives" => "alternatives.md",
"Debugging" => "debugging.md",
"Formulas" => "formulas.md",
"Roadmap" => "roadmap.md",
],
]

8 changes: 8 additions & 0 deletions docs/src/alternatives.md
@@ -1,5 +1,7 @@
# Competitors

## Julia

We compare features among the following Julia packages:

* HiddenMarkovModels.jl (abbreviated to HMMs.jl)
@@ -22,3 +24,9 @@ There are also more generic packages for probabilistic programming, which are ab
| Logarithmic probabilities | halfway | halfway | yes |

Sim = Simulation, FB = Forward-Backward, Vit = Viterbi, BW = Baum-Welch

## Python

* [hmmlearn](https://github.com/hmmlearn/hmmlearn) (based on NumPy)
* [pomegranate](https://github.com/jmschrei/pomegranate) (based on PyTorch)
* [dynamax](https://github.com/probml/dynamax) (based on JAX)
20 changes: 10 additions & 10 deletions docs/src/api.md
@@ -9,19 +9,15 @@ HiddenMarkovModels
```@docs
AbstractHMM
HMM
PermutedHMM
```

## Basics

```@docs
rand
length
eltype
initialization
transition_matrix
obs_distributions
fit!
```

## Inference
@@ -32,7 +28,16 @@ forward
viterbi
forward_backward
baum_welch
MultiSeq
```

## Utils

```@docs
length
eltype
fit!
HiddenMarkovModels.fit_in_sequence!
HiddenMarkovModels.seq_limits
```

## Internals
@@ -46,7 +51,6 @@ Do not consider them to be part of the API subject to semantic versioning.
HiddenMarkovModels.ForwardStorage
HiddenMarkovModels.ViterbiStorage
HiddenMarkovModels.ForwardBackwardStorage
HiddenMarkovModels.BaumWelchStorage
```

### Initializing storage
@@ -55,7 +59,6 @@ HiddenMarkovModels.BaumWelchStorage
HiddenMarkovModels.initialize_forward
HiddenMarkovModels.initialize_viterbi
HiddenMarkovModels.initialize_forward_backward
HiddenMarkovModels.initialize_baum_welch
```

### Modifying storage
@@ -72,9 +75,6 @@ HiddenMarkovModels.baum_welch!
```@docs
HiddenMarkovModels.rand_prob_vec
HiddenMarkovModels.rand_trans_mat
HiddenMarkovModels.project_prob_vec
HiddenMarkovModels.project_trans_mat
HiddenMarkovModels.fit_in_sequence!
HiddenMarkovModels.LightDiagNormal
HiddenMarkovModels.LightCategorical
```
27 changes: 0 additions & 27 deletions docs/src/background.md

This file was deleted.

5 changes: 5 additions & 0 deletions docs/src/formulas.md
@@ -202,3 +202,8 @@ To sum up,
\frac{\partial \log \mathcal{L}}{\partial \log b_{j,t}} &= \sum_{i=1}^N \bar{\alpha}_{i,t-1} a_{i,j} \frac{b_{j,t}}{m_t} \bar{\beta}_{j,t} = \frac{\bar{\alpha}_{j,t} \bar{\beta}_{j,t}}{c_t} = \gamma_{j,t}
\end{align*}
```

## Bibliography

```@bibliography
```
10 changes: 0 additions & 10 deletions docs/src/roadmap.md

This file was deleted.

9 changes: 5 additions & 4 deletions examples/basics.jl
@@ -2,6 +2,7 @@

using Distributions
using HiddenMarkovModels
import HiddenMarkovModels as HMMs
using Random
using Test #src

@@ -48,10 +49,10 @@ hmm_guess = HMM(init_guess, trans_guess, dists_guess)
#-

obs_seqs = [rand(rng, hmm, rand(T:(2T))).obs_seq for k in 1:100];
obs_seq_concat = reduce(vcat, obs_seqs)
seq_ends = cumsum(length.(obs_seqs))

hmm_est, logL_evolution = baum_welch(hmm_guess, obs_seq_concat; seq_ends=seq_ends)
hmm_est, logL_evolution = baum_welch(
hmm_guess, reduce(vcat, obs_seqs); seq_ends=cumsum(length.(obs_seqs))
)

#-

@@ -60,4 +61,4 @@ first(logL_evolution), last(logL_evolution)
#-

cat(hmm_est.trans, hmm.trans; dims=3)
@test similar_hmms(hmm_est, hmm_trans; atol=1e-1) #src
@test HMMs.similar_hmms(hmm_est, hmm; atol=1e-1) #src
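
A note on the `seq_ends` convention used in the diff above: multiple sequences are concatenated into a single vector, and `seq_ends` stores the index at which each sequence ends. A minimal sketch in plain Julia:

```julia
obs_seqs = [[1.0, 2.0], [3.0, 4.0, 5.0]]  # two observation sequences
obs_seq = reduce(vcat, obs_seqs)          # [1.0, 2.0, 3.0, 4.0, 5.0]
seq_ends = cumsum(length.(obs_seqs))      # [2, 5]: sequence k ends at seq_ends[k]
```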
14 changes: 8 additions & 6 deletions examples/controlled.jl
@@ -100,13 +100,15 @@ hmm_guess = ControlledGaussianHMM(init_guess, trans_guess, dist_coeffs_guess)
control_seqs = [[randn(rng, 3) for t in 1:rand(T:(2T))] for k in 1:100];
obs_seqs = [rand(rng, hmm, control_seq).obs_seq for control_seq in control_seqs];

obs_seq_concat = reduce(vcat, obs_seqs);
control_seq_concat = reduce(vcat, control_seqs);
seq_ends = cumsum(length.(obs_seqs));

hmm_est, logL_evolution = baum_welch(
hmm_guess, obs_seq_concat; control_seq=control_seq_concat, seq_ends
hmm_guess,
reduce(vcat, obs_seqs);
control_seq=reduce(vcat, control_seqs),
seq_ends=cumsum(length.(obs_seqs)),
)
@test HMMs.similar_hmms( #src
hmm_est, hmm; control_seq=[[1, 0, 0], [0, 1, 0], [0, 0, 1]], atol=0.1 #src
hmm_est, #src
hmm; #src
control_seq=[[1, 0, 0], [0, 1, 0], [0, 0, 1]], #src
atol=0.1, #src
) #src