cleanlab · sanjanag · Nov 15, 2023 · Nov 13, 2023 · Nov 13, 2023 · Nov 15, 2023
diff --git a/README.md b/README.md
@@ -35,15 +35,14 @@ wget -nc 'https://cleanlab-public.s3.amazonaws.com/CleanVision/image_files.zip'
 ```python
 from cleanvision import Imagelab
 
-if __name__ == '__main__':
-    # Specify path to folder containing the image files in your dataset
-    imagelab = Imagelab(data_path="FOLDER_WITH_IMAGES/")
-
-    # Automatically check for a predefined list of issues within your dataset
-    imagelab.find_issues()
-
-    # Produce a neat report of the issues found in your dataset
-    imagelab.report()
+# Specify path to folder containing the image files in your dataset
+imagelab = Imagelab(data_path="FOLDER_WITH_IMAGES/")
+
+# Automatically check for a predefined list of issues within your dataset
+imagelab.find_issues()
+
+# Produce a neat report of the issues found in your dataset
+imagelab.report()
 ```
 
 2. CleanVision diagnoses many types of issues, but you can also check for only specific issues.
@@ -67,6 +66,7 @@ imagelab.report(issue_types=issue_types)
 - [Additional example notebooks](https://github.com/cleanlab/cleanvision-examples)
 - [Documentation](https://cleanvision.readthedocs.io/)
 - [Blog Post](https://cleanlab.ai/blog/cleanvision/)
+- [FAQ](https://cleanvision.readthedocs.io/en/latest/faq.html)
 
 ## *Clean* your data for better Computer *Vision*
 

diff --git a/...ce/cleanvision/dataset/folder_dataset.rst → ...ce/cleanvision/dataset/fsspec_dataset.rst b/...ce/cleanvision/dataset/folder_dataset.rst → ...ce/cleanvision/dataset/fsspec_dataset.rst
@@ -1,7 +1,7 @@
-Folder Dataset
+Fsspec Dataset
 ==============
 
-.. automodule:: cleanvision.dataset.folder_dataset
+.. automodule:: cleanvision.dataset.fsspec_dataset
    :autosummary:
    :members:
    :undoc-members:

diff --git a/docs/source/cleanvision/dataset/index.rst b/docs/source/cleanvision/dataset/index.rst
@@ -10,7 +10,7 @@ Dataset
 
 .. toctree::
     base_dataset
-    folder_dataset
+    fsspec_dataset
     hf_dataset
     torch_dataset
     utils
diff --git a/docs/source/faq.rst b/docs/source/faq.rst
@@ -0,0 +1,68 @@
+Frequently Asked Questions
+==========================
+
+Answers to frequently asked questions about the `cleanvision <https://github.com/cleanlab/cleanvision/>`_ open-source package.
+
+1. **What kind of machine learning tasks can I use CleanVision for?**
+
+CleanVision is independent of any particular machine learning task: it works directly on images and does not require any labels or metadata to detect issues in the dataset. The issues detected by CleanVision are helpful for all kinds of machine learning tasks.
+
+2. **Can I check for specific issues in my dataset?**
+
+
+Yes, you can specify issues like ``light`` or ``blurry`` in the ``issue_types`` argument when calling ``Imagelab.find_issues``:
+
+.. code-block:: python3
+
+    imagelab.find_issues(issue_types={"light": {}, "blurry": {}})
+
+
+3. **What dataset formats does CleanVision support?**
+
+
+Apart from plain image files, CleanVision also works with HuggingFace and Torchvision datasets. You can use the dataset objects as is with the ``image_key`` argument.
+
+.. code-block:: python3
+
+    imagelab = Imagelab(hf_dataset=dataset, image_key="image")
+
+For more detailed usage instructions and examples, check the :ref:`tutorials`.
+
+Commonly encountered errors
+---------------------------
+
+- **RuntimeError: An attempt has been made to start a new process before the current process has finished its bootstrapping phase.**
+
+.. code-block:: console
+
+    This probably means that you are not using fork to start your
+    child processes and you have forgotten to use the proper idiom
+    in the main module:
+
+        if __name__ == '__main__':
+            freeze_support()
+            ...
+
+    The "freeze_support()" line can be omitted if the program
+    is not going to be frozen to produce an executable.
+
+    To fix this issue, refer to the "Safe importing of main module"
+    section in https://docs.python.org/3/library/multiprocessing.html
+
+
+The above error is caused by the ``multiprocessing`` module behaving differently on macOS and Windows, where child processes are spawned rather than forked by default. A detailed discussion of the issue can be found `here <https://github.com/cleanlab/cleanlab/issues/159>`_.
+One workaround is to run CleanVision under the ``if __name__ == "__main__":`` guard, like this:
+
+.. code-block:: python3
+
+    if __name__ == "__main__":
+
+        imagelab = Imagelab(data_path="FOLDER_WITH_IMAGES/")
+        imagelab.find_issues()
+        imagelab.report()
+
+Alternatively, use ``n_jobs=1`` to disable parallel processing:
+
+.. code-block:: python3
+
+    imagelab.find_issues(n_jobs=1)
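
Note on the FAQ entry above (commentary, not part of the patch): the following is a minimal, standard-library sketch of why both suggested workarounds avoid the bootstrapping error. The names `square` and `run_parallel`, and the way `n_jobs` is handled here, are hypothetical stand-ins for illustration only; they are not CleanVision's implementation. The sketch assumes the usual behaviour of Python's `multiprocessing`: with the spawn start method, each child re-imports the main module, so any unguarded top-level code that starts processes would run again in the child.

```python
import multiprocessing as mp


def square(x):
    # Work function; must be defined at module level so child processes can import it.
    return x * x


def run_parallel(values, n_jobs):
    # Hypothetical stand-in for a library call that accepts n_jobs.
    if n_jobs == 1:
        # Serial path: no child processes are started, so no guard is required.
        return [square(v) for v in values]
    # Parallel path: starts child processes, so the caller should be guarded.
    with mp.Pool(processes=n_jobs) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    # Only the process that was launched as a script enters this block.
    # Children that re-import this module under spawn skip it.
    print("start method:", mp.get_start_method())
    print(run_parallel([1, 2, 3], n_jobs=2))
```

With the guard in place the script should behave the same under fork and spawn. Calling `run_parallel(values, n_jobs=1)` never creates a pool, which mirrors why `n_jobs=1` sidesteps the error at the cost of parallelism.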
diff --git a/docs/source/index.rst b/docs/source/index.rst
@@ -4,47 +4,50 @@
 
 Documentation
 =======================================
+
 CleanVision automatically detects various issues in image datasets, such as images that are: (near) duplicates, blurry,
 over/under-exposed, etc. This data-centric AI package is designed as a quick first step for any computer vision project
 to find problems in your dataset, which you may want to address before applying machine learning.
 
 
 Installation
-============
-
-To install the latest stable version (recommended):
+------------
 
-.. code-block:: console
+.. tabs::
 
-   $ pip install cleanvision
+   .. tab:: pip
 
+      .. code-block:: bash
 
-To install the bleeding-edge developer version:
+         pip install cleanvision
 
-.. code-block:: console
+      To install the package with all optional dependencies:
 
-   $ pip install git+https://github.com/cleanlab/cleanvision.git
+      .. code-block:: bash
 
-To install with HuggingFace optional dependencies
+         pip install "cleanvision[all]"
 
-.. code-block:: console
+   .. tab:: source
 
-   $ pip install "cleanvision[huggingface]"
+      .. code-block:: bash
 
-To install with Torchvision optional dependencies
+         pip install git+https://github.com/cleanlab/cleanvision.git
 
-.. code-block:: console
+      To install the package with all optional dependencies:
 
-   $ pip install "cleanvision[pytorch]"
+      .. code-block:: bash
 
+         pip install "git+https://github.com/cleanlab/cleanvision.git#egg=cleanvision[all]"
 
 
 
 
-Quickstart
-===========
+How to Use CleanVision
+----------------------
 
-1. Using CleanVision to audit your image data is as simple as running the code below:
+Basic Usage
+^^^^^^^^^^^
+Here's how to quickly audit your image data:
 
 
 .. code-block:: python3
@@ -60,8 +63,9 @@ Quickstart
     # Produce a neat report of the issues found in your dataset
     imagelab.report()
 
-2. CleanVision diagnoses many types of issues, but you can also check for only specific issues:
-
+Targeted Issue Detection
+^^^^^^^^^^^^^^^^^^^^^^^^
+You can also focus on specific issues:
 
 .. code-block:: python3
 
@@ -72,8 +76,9 @@ Quickstart
     # Produce a report with only the specified issue_types
     imagelab.report(issue_types.keys())
 
-3. Run CleanVision on a Hugging Face dataset
-
+Integration with Hugging Face Dataset
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Easily use CleanVision with a Hugging Face dataset:
 
 .. code-block:: python3
 
@@ -90,7 +95,9 @@ Quickstart
 
     imagelab.report()
 
-4. Run CleanVision on a Torchvision dataset
+Integration with Torchvision Dataset
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+CleanVision works smoothly with Torchvision datasets too:
 
 
 .. code-block:: python3
@@ -111,29 +118,32 @@ Quickstart
     imagelab.report()
 
 
-More on how to get started with CleanVision:
-- `Example Python script <https://github.com/cleanlab/cleanvision/blob/main/docs/source/tutorials/run.py>`_
-- `Example Notebooks <https://github.com/cleanlab/cleanvision-examples>`_
-- `How To Contribute <https://github.com/cleanlab/cleanvision/blob/main/CONTRIBUTING.md>`_
+Additional Resources
+--------------------
+- Get started with our `Example Notebook <https://cleanvision.readthedocs.io/en/latest/tutorials/tutorial.html>`_
+- Explore more `Example Notebooks <https://github.com/cleanlab/cleanvision-examples>`_
+- Learn how to contribute in the `Contribution Guide <https://github.com/cleanlab/cleanvision/blob/main/CONTRIBUTING.md>`_
 
 
 .. toctree::
    :hidden:
-   :maxdepth: 1
-   :caption: Getting Started
 
    Quickstart <self>
-.. _api-reference:
 
+
+.. _tutorials:
 .. toctree::
    :hidden:
    :maxdepth: 3
    :caption: Tutorials
+   :name: _tutorials
 
-   tutorials/tutorial.ipynb
+   How to Use CleanVision <tutorials/tutorial.ipynb>
    tutorials/torchvision_dataset.ipynb
    tutorials/huggingface_dataset.ipynb
+   Frequently Asked Questions <faq>
 
+.. _api-reference:
 .. toctree::
    :hidden:
    :maxdepth: 3
@@ -153,3 +163,4 @@ More on how to get started with CleanVision:
    GitHub <https://github.com/cleanlab/cleanvision.git>
    PyPI <https://pypi.org/project/cleanvision/>
    Cleanlab Studio <https://cleanlab.ai/studio/?utm_source=cleanvision&utm_medium=docs&utm_campaign=clostostudio>
+
diff --git a/docs/source/tutorials/tutorial.ipynb b/docs/source/tutorials/tutorial.ipynb
@@ -5,7 +5,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# Overview"
+    "# How to Use CleanVision"
    ]
   },
   {