nerfstudio-project · maturk · Oct 6, 2023 · Oct 2, 2023 · Oct 2, 2023 · Oct 3, 2023
diff --git a/docs/requirements.txt b/docs/requirements.txt
@@ -1,4 +1,5 @@
 furo
 sphinx
 sphinx-copybutton
-sphinx-design
+sphinx-design
+sphinxcontrib-bibtex
diff --git a/docs/source/_C/cuda.rst b/docs/source/_C/cuda.rst
@@ -0,0 +1,181 @@
+CUDA Lib
+===================================
+
+.. currentmodule:: diff_rast
+
+
+Some of the important CUDA backend functions are exposed to Python through the ``cuda`` submodule. You can import the CUDA bindings with:
+
+.. code-block:: python
+
+    import diff_rast.cuda as _C
+    help(_C)
+
+The following functions are currently supported:
+
+* _C.rasterize_forward 
+* _C.rasterize_backward
+* _C.compute_cov2d_bounds_forward 
+* _C.project_gaussians_forward
+* _C.project_gaussians_backward
+* _C.compute_sh_forward
+* _C.compute_sh_backward
+* _C.compute_cumulative_intersects 
+* _C.map_gaussian_to_intersects
+
+
+rasterize_forward
+-----------------
+
+.. code-block:: python
+
+    _C.rasterize_forward(*args, **kwargs)
+
+        PARAMETERS:
+            xys: Float[Tensor, "*batch 2"],
+            depths: Float[Tensor, "*batch 1"],
+            radii: Float[Tensor, "*batch 1"],
+            conics: Float[Tensor, "*batch 3"],
+            num_tiles_hit: Int[Tensor, "*batch 1"],
+            colors: Float[Tensor, "*batch channels"],
+            opacity: Float[Tensor, "*batch 1"],
+            img_height: int,
+            img_width: int,
+            background: Float[Tensor, "channels"]
+
+rasterize_backward
+------------------
+
+.. code-block:: python
+
+    _C.rasterize_backward(*args, **kwargs)
+
+        PARAMETERS:
+            img_height: int,
+            img_width: int,
+            gaussian_ids_sorted: ,
+            tile_bins: Int[Tensor, "x h 1"],
+            xys: Float[Tensor, "*batch 2"],
+            conics: Float[Tensor, "*batch 3"],
+            colors: Float[Tensor, "*batch channels"],
+            opacity: Float[Tensor, "*batch 1"],
+            background: Float[Tensor, "channels"],
+            final_Ts: Float[Tensor, "*batch 1"],
+            final_idx: Float[Tensor, "*batch 1"],
+            v_out_img: Float[Tensor, "*batch channels"]
+
+
+compute_cov2d_bounds_forward
+----------------------------
+
+.. code-block:: python
+
+    _C.compute_cov2d_bounds_forward(*args, **kwargs)
+
+        PARAMETERS:
+            cov2d: Float[Tensor, "*batch 3"]
+
+
+project_gaussians_forward
+-------------------------
+
+.. code-block:: python
+
+    _C.project_gaussians_forward(*args, **kwargs)
+
+        PARAMETERS:
+            num_points: int,
+            means3d: Float[Tensor, "*batch 3"],
+            scales: Float[Tensor, "*batch 3"],
+            glob_scale: int,
+            quats: Float[Tensor, "*batch 4"],
+            viewmat: Float[Tensor, "*batch 4 4"],
+            projmat: Float[Tensor, "*batch 4 4"],
+            fx: float,
+            fy: float,
+            img_height: int,
+            img_width: int,
+            tile_bounds: Int[Tensor, "tiles.x tiles.y 1"],
+            clip_thresh: float,
+
+
+project_gaussians_backward
+--------------------------
+
+.. code-block:: python
+
+    _C.project_gaussians_backward(*args, **kwargs)
+
+        PARAMETERS:
+            num_points: int,
+            means3d: Float[Tensor, "*batch 3"],
+            scales: Float[Tensor, "*batch 3"],
+            glob_scale: int,
+            quats: Float[Tensor, "*batch 4"],
+            viewmat: Float[Tensor, "*batch 4 4"],
+            projmat: Float[Tensor, "*batch 4 4"],
+            fx: float,
+            fy: float,
+            img_height: int,
+            img_width: int,
+            cov3d: Int[Tensor, "*batch 5"],
+            radii: Int[Tensor, "*batch 1"],
+            conics: Float[Tensor, "*batch 3"],
+            v_xys: Float[Tensor, "*batch 2"],
+            v_conics: Float[Tensor, "*batch 3"]
+
+
+compute_sh_forward
+------------------
+
+.. code-block:: python
+
+    _C.compute_sh_forward(*args, **kwargs)
+
+        PARAMETERS:
+            num_points: int, 
+            degree: int, 
+            viewdirs: Float[Tensor, "*batch 3"], 
+            coeffs: Float[Tensor, "*batch degree channels"]
+
+
+compute_sh_backward
+-------------------
+
+.. code-block:: python
+
+    _C.compute_sh_backward(*args, **kwargs)
+
+        PARAMETERS:
+            num_points: int, 
+            degree: int, 
+            viewdirs: Float[Tensor, "*batch 3"], 
+            v_colors: Float[Tensor, "*batch channels"]
+
+
+compute_cumulative_intersects
+-----------------------------
+
+.. code-block:: python
+
+    _C.compute_cumulative_intersects(*args, **kwargs)
+
+        PARAMETERS:
+            num_points: int, 
+            num_tiles_hit: Int[Tensor, "*batch 1"]
+
+
+map_gaussian_to_intersects
+--------------------------
+
+.. code-block:: python
+
+    _C.map_gaussian_to_intersects(*args, **kwargs)
+
+        PARAMETERS:
+            num_points: int, 
+            xys: Int[Tensor, "*batch 2"], 
+            depths: Int[Tensor, "*batch 1"], 
+            radii: Int[Tensor, "*batch 1"], 
+            cum_tiles_hit: Int[Tensor, "*batch 1"], 
+            tile_bounds: Int[Tensor, "tiles.x tiles.y 1"]
diff --git a/docs/source/apis/proj.rst b/docs/source/apis/proj.rst
@@ -1,7 +1,43 @@
-Projection
+ProjectGaussians
 ===================================
 
 .. currentmodule:: diff_rast
 
+Given 3D gaussians parametrized by means :math:`μ`, covariances :math:`Σ`, colors :math:`c`, and opacities :math:`o`, the 
+ProjectGaussians function computes the projected 2D gaussians in the camera frame with means :math:`μ'`, covariances :math:`Σ'`, and depths :math:`z`
+as well as their maximum radii in screen space and conic parameters. 
+
+Note that covariances are reparametrized by the eigendecomposition:
+
+.. math::
+
+   Σ = RSS^{T}R^{T}
+
+Where rotation matrices :math:`R` are obtained from four-dimensional quaternions and :math:`S` is a diagonal matrix of per-axis scales.
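+
+As a minimal sketch of this reparametrization (not part of the library), the NumPy snippet below builds :math:`Σ` from a rotation matrix and per-axis scales.
+The helper name ``covariance_from_rotation_scale`` and the example values are hypothetical; :math:`S` is assumed to be a diagonal scale matrix and :math:`R` is assumed to have already been converted from a quaternion:
+
+.. code-block:: python
+
+    import numpy as np
+
+    def covariance_from_rotation_scale(R, scales):
+        """Build Σ = R S Sᵀ Rᵀ from a 3x3 rotation and three per-axis scales."""
+        S = np.diag(scales)  # diagonal scale matrix (assumption)
+        M = R @ S
+        return M @ M.T       # equals R S Sᵀ Rᵀ
+
+    # Hypothetical example: 90-degree rotation about the z axis.
+    R = np.array([[0.0, -1.0, 0.0],
+                  [1.0,  0.0, 0.0],
+                  [0.0,  0.0, 1.0]])
+    scales = np.array([1.0, 2.0, 3.0])
+    cov = covariance_from_rotation_scale(R, scales)
+
+Writing the covariance this way keeps it symmetric and positive semi-definite for any choice of rotation and scales.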
+
+The projection of 3D Gaussians is approximated with the Jacobian of the perspective projection equation 
+as shown in :cite:p:`zwicker2002ewa`:
+
+.. math::
+
+    J = \begin{bmatrix}
+            f_{x}/t_{z} & 0 & -f_{x} t_{x}/t_{z}^{2} \\
+            0 & f_{y}/t_{z} & -f_{y} t_{y}/t_{z}^{2} \\
+            0 & 0 & 0
+        \end{bmatrix}
+
+Where :math:`t` is the center of a gaussian in the camera frame, :math:`t = Wμ+p`. The projected 2D covariance is then given by:
+
+.. math::
+
+    Σ' = J W Σ W^{⊤} J^{⊤}
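+
+The projection can be sketched in NumPy as follows. This is an illustration under stated assumptions rather than the CUDA implementation: the function name ``project_covariance`` is hypothetical, :math:`W` is taken to be the 3x3 rotation part of the view transform, and the 2D covariance is read from the upper-left 2x2 block because the last row of :math:`J` is zero:
+
+.. code-block:: python
+
+    import numpy as np
+
+    def project_covariance(cov3d, W, t, fx, fy):
+        """Approximate Σ' = J W Σ Wᵀ Jᵀ for one gaussian.
+
+        cov3d: 3x3 covariance Σ in the world frame
+        W:     3x3 rotation part of the view transform (assumption)
+        t:     gaussian center in the camera frame (t_x, t_y, t_z)
+        """
+        tx, ty, tz = t
+        J = np.array([
+            [fx / tz, 0.0, -fx * tx / tz**2],
+            [0.0, fy / tz, -fy * ty / tz**2],
+            [0.0, 0.0, 0.0],
+        ])
+        T = J @ W
+        cov = T @ cov3d @ T.T
+        return cov[:2, :2]  # the third row and column are zero
+
+    # Hypothetical example: isotropic gaussian on the optical axis at depth 2.
+    cov2d = project_covariance(
+        cov3d=0.01 * np.eye(3),
+        W=np.eye(3),
+        t=np.array([0.0, 0.0, 2.0]),
+        fx=100.0,
+        fy=100.0,
+    )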
+
+
+Citations
+-------------
+.. bibliography::
+    :style: unsrt
+    :filter: docname in docnames
+
 .. autoclass:: ProjectGaussians
     :members:
diff --git a/docs/source/apis/rast.rst b/docs/source/apis/rast.rst
@@ -0,0 +1,35 @@
+RasterizeGaussians
+===================================
+
+.. currentmodule:: diff_rast
+
+Given 2D gaussians that are parametrized by their means :math:`μ'` and covariances :math:`Σ'` as well as their radii and conic parameters,
+the RasterizeGaussians function first sorts each gaussian such that all gaussians within the bounds of a tile are grouped and sorted by increasing depth :math:`z`,
+and then renders each pixel within a tile with alpha-compositing. 
+
+The discrete rendering equation is given by: 
+
+.. math::
+
+    \sum_{n=1}^{N}c_{n}·α_{n}·T_{n}
+
+Where 
+
+.. math::
+
+    T_{n} = \prod_{m=1}^{n-1}(1-α_{m})
+
+And 
+
+.. math::
+
+    α_{n} = o_{n} \exp(-σ_{n})
+
+    σ_{n} = \frac{1}{2} ∆^{⊤}_{n} Σ'^{-1} ∆_{n}
+
+
+Here :math:`∆_{n} ∈ R^{2}` (delta) is the offset between the center of the :math:`n`-th gaussian and the center of the rendered pixel, and :math:`σ_{n}` (sigma) is half the squared Mahalanobis distance, which measures how many standard deviations the pixel center lies from the gaussian center.
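+
+For a single pixel, the compositing described above can be sketched in NumPy as follows. This is a reference illustration, not the tiled CUDA kernel: the function name ``composite_pixel`` is hypothetical, the gaussians are assumed to be already sorted by increasing depth, and the transmittance is accumulated front to back as the product of :math:`(1-α)` over the gaussians in front:
+
+.. code-block:: python
+
+    import numpy as np
+
+    def composite_pixel(pixel, means2d, covs2d, colors, opacities):
+        """Front-to-back alpha compositing of depth-sorted 2D gaussians."""
+        out = np.zeros(colors.shape[1])
+        T = 1.0  # transmittance: product of (1 - alpha) of gaussians in front
+        for mu, cov, c, o in zip(means2d, covs2d, colors, opacities):
+            delta = pixel - mu                                # offset to center
+            sigma = 0.5 * delta @ np.linalg.inv(cov) @ delta  # quadratic form
+            alpha = o * np.exp(-sigma)
+            out += c * alpha * T
+            T *= 1.0 - alpha
+        return out
+
+    # Hypothetical example: a red then a blue gaussian, both centered on the pixel.
+    pixel = np.array([8.0, 8.0])
+    means2d = np.array([[8.0, 8.0], [8.0, 8.0]])
+    covs2d = np.array([np.eye(2), np.eye(2)])
+    colors = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
+    opacities = np.array([0.5, 0.5])
+    rgb = composite_pixel(pixel, means2d, covs2d, colors, opacities)
+
+Because both gaussians sit exactly on the pixel, :math:`σ = 0` and each :math:`α` equals its opacity, so the nearer gaussian contributes half of its color and the farther one a quarter.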
+
+
+.. autoclass:: RasterizeGaussians
+    :members:
diff --git a/docs/source/conf.py b/docs/source/conf.py
@@ -18,6 +18,7 @@
     "sphinx.ext.autodoc",
     "sphinx.ext.autosummary",
     "sphinx.ext.intersphinx",
+    "sphinxcontrib.bibtex",
 ]
 
 intersphinx_mapping = {
@@ -40,3 +41,6 @@
 
 # typehints
 autodoc_typehints = "description"
+
+# citations
+bibtex_bibfiles = ["references.bib"]
diff --git a/docs/source/conventions/data_conventions.rst b/docs/source/conventions/data_conventions.rst
@@ -0,0 +1,41 @@
+Data Conventions
+===================================
+
+.. currentmodule:: diff_rast
+
+Here we explain the various conventions used in our repo.
+
+Rotation Convention
+-------------------
+We represent rotations with four-dimensional vectors (quaternions) :math:`q = (w,x,y,z)`; for a unit-norm :math:`q`, the 3x3 :math:`SO(3)` rotation matrix is defined by:
+
+.. math::
+
+    R = \begin{bmatrix}
+        1 - 2 \left( y^2 + z^2 \right) & 2 \left( x y - w z \right) & 2 \left( x z + w y \right) \\
+        2 \left( x y + w z \right) & 1 - 2 \left( x^2 + z^2 \right) & 2 \left( y z - w x \right) \\
+        2 \left( x z - w y \right) & 2 \left( y z + w x \right) & 1 - 2 \left( x^2 + y^2 \right) \\
+        \end{bmatrix}
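+
+A minimal NumPy sketch of this conversion is given below. The function name ``quat_to_rotmat`` is hypothetical, and normalizing the input first is an assumption made here so that the result is a valid rotation:
+
+.. code-block:: python
+
+    import numpy as np
+
+    def quat_to_rotmat(q):
+        """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix."""
+        w, x, y, z = q / np.linalg.norm(q)  # normalize (assumption)
+        return np.array([
+            [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
+            [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
+            [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
+        ])
+
+    # Example: a 90-degree rotation about the z axis.
+    half = np.pi / 4
+    R = quat_to_rotmat(np.array([np.cos(half), 0.0, 0.0, np.sin(half)]))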
+
+View Matrix and Projection Matrix
+---------------------------------
+We refer to the `view matrix` :math:`W` as the world to camera frame transformation (referred to as `w2c` in some sources) that maps
+3D world points :math:`(x,y,z)_{world}` to 3D camera points :math:`(x,y,z)_{cam}` where :math:`z_{cam}` is the relative depth to the camera center.
+
+
+The `projection matrix` refers to the full projective transformation that maps 3D points in the world frame to the 2D points in the image/pixel frame.
+This transformation is the concatenation of the perspective projection matrix :math:`K` (obtained from camera intrinsics) and the view matrix :math:`W`.
+We adopt the `OpenGL <http://www.songho.ca/opengl/gl_projectionmatrix.html>`_ perspective projection convention. The projection matrix :math:`P` is given by:
+
+.. math:: 
+
+    P = K W
+
+.. math::
+
+    K = \begin{bmatrix}
+        \frac{2n}{r - l} & 0.0 & \frac{r + l}{r - l} & 0.0 \\
+        0.0 & \frac{2n}{t - b} & \frac{t + b}{t - b} & 0.0 \\
+        0.0 & 0.0 & \frac{f + n}{f - n} & -\frac{f \cdot n}{f - n} \\
+        0.0 & 0.0 & 1.0 & 0.0 \\
+    \end{bmatrix}
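+
+As an illustration, the matrix above can be assembled in NumPy and combined with a view matrix. The function name ``perspective_matrix`` is hypothetical, and :math:`n, f, l, r, b, t` are read as the near, far, left, right, bottom, and top frustum bounds, following the linked OpenGL convention:
+
+.. code-block:: python
+
+    import numpy as np
+
+    def perspective_matrix(n, f, l, r, b, t):
+        """Assemble K from the frustum bounds, entry by entry as written above."""
+        return np.array([
+            [2 * n / (r - l), 0.0, (r + l) / (r - l), 0.0],
+            [0.0, 2 * n / (t - b), (t + b) / (t - b), 0.0],
+            [0.0, 0.0, (f + n) / (f - n), -f * n / (f - n)],
+            [0.0, 0.0, 1.0, 0.0],
+        ])
+
+    # Hypothetical example: symmetric frustum, camera at the world origin.
+    K = perspective_matrix(n=1.0, f=100.0, l=-1.0, r=1.0, b=-1.0, t=1.0)
+    W = np.eye(4)                 # identity view matrix (world == camera)
+    P = K @ W                     # full projection matrix
+    point_world = np.array([0.5, -0.5, 2.0, 1.0])  # homogeneous 3D point
+    clip = P @ point_world
+    ndc = clip[:3] / clip[3]      # perspective divide by w = z_cam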
diff --git a/docs/source/examples/simple_trainer.rst b/docs/source/examples/simple_trainer.rst
@@ -0,0 +1,28 @@
+Simple Trainer
+===================================
+
+.. currentmodule:: diff_rast
+
+Training on an image
+-----------------------------------
+The ``examples/simple_trainer.py`` script lets you test the basic forward projection and rasterization of random gaussians
+and their differentiability by overfitting them to a single training image.
+
+Simply run the script with:
+
+.. code-block:: bash
+    :caption: simple_trainer.py
+
+    python examples/simple_trainer.py --height 256 --width 256 --num_points 2000 --save_imgs
+
+to get a result similar to the one below:
+
+.. image:: ../imgs/square.gif
+    :alt: Gaussians overfit on a single image
+    :width: 256
+
+You can also provide a path to your own custom image file using the ``--img_path`` flag:
+
+.. code-block:: bash
+
+    python examples/simple_trainer.py --img_path PATH_TO_IMG --save_imgs
diff --git a/docs/source/imgs/square.gif b/docs/source/imgs/square.gif
diff --git a/docs/source/imgs/training.gif b/docs/source/imgs/training.gif