Update documentation
raphodn committed Apr 1, 2024
1 parent 3585468 commit 7862959
Showing 1 changed file with 15 additions and 9 deletions.
24 changes: 15 additions & 9 deletions docs/usage.md
@@ -26,11 +26,10 @@ All parameters are optional with the exception of user_agent, but here is a desc

- `username` and `password` are used to provide authentication (required for write requests)
- `country` is used to specify the country, which the API uses to return products specific to that country or to infer which language to use by default. `world` (all products) is the default value
- `flavor`: the Open*Facts project you want to interact with: `off` (Open Food Facts, default), `obf` (Open Beauty Facts),...
- `flavor`: the Open*Facts project you want to interact with: `off` (Open Food Facts, default), `obf` (Open Beauty Facts), `opff` (Open Pet Food Facts), `opf` (Open Products Facts)
- `version`: API version (v2 is the default)
- `environment`: either `org` for production environment (openfoodfacts.org) or `net` for staging (openfoodfacts.net)
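
To make the role of these parameters more concrete, here is a minimal sketch of how `country`, `flavor`, `version` and `environment` could combine into a product URL. The helper `build_product_url` and the `FLAVOR_DOMAINS` mapping are hypothetical names used for illustration only and are not part of the SDK; the URL layout is an assumption based on the public Open Food Facts product endpoints.

```python
# Illustrative only: this helper is NOT part of the openfoodfacts SDK.
# Assumed URL layout: https://<country>.<project domain>.<environment>/api/<version>/product/<barcode>

FLAVOR_DOMAINS = {
    "off": "openfoodfacts",      # Open Food Facts
    "obf": "openbeautyfacts",    # Open Beauty Facts
    "opff": "openpetfoodfacts",  # Open Pet Food Facts
    "opf": "openproductsfacts",  # Open Products Facts
}


def build_product_url(code, country="world", flavor="off", version="v2", environment="org"):
    """Return the (assumed) product URL for a barcode, mirroring the documented defaults."""
    if flavor not in FLAVOR_DOMAINS:
        raise ValueError(f"unknown flavor: {flavor!r}")
    if environment not in ("org", "net"):
        raise ValueError(f"unknown environment: {environment!r}")
    domain = FLAVOR_DOMAINS[flavor]
    return f"https://{country}.{domain}.{environment}/api/{version}/product/{code}"


# Defaults: all products worldwide, Open Food Facts, API v2, production
print(build_product_url("3017620422003"))
# Staging environment of Open Beauty Facts, French products
print(build_product_url("3017620422003", country="fr", flavor="obf", environment="net"))
```

In practice the SDK builds these URLs for us; the sketch only shows why changing `environment` to `net` targets staging and why `flavor` selects a different project.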


*Get information about a product*

```python
@@ -57,25 +56,32 @@ want to update. Example:

## Using the dataset

If you're planning to perform data analysis on Open Food Facts, the easiest way is to download and use the Open Food Facts dataset dump.
Fortunately it can be done really easily using the SDK:
If you're planning to perform data analysis on Open Food Facts, the easiest way is to download and use the Open Food Facts dataset dump. Fortunately it can be done really easily using the SDK:

```python
from openfoodfacts import ProductDataset

dataset = ProductDataset("csv")
dataset = ProductDataset(dataset_type="csv")

for product in dataset:
    print(product["product_name"])
```

With `dataset = ProductDataset("csv")`, we automatically download (and cache) the dataset. We can then iterate over it to get information about products.
With `dataset = ProductDataset(dataset_type="csv")`, we automatically download (and cache) the food dataset. We can then iterate over it to get information about products.

Two dataset types are available: `csv` and `jsonl`. The `jsonl` dataset contains all the Open Food Facts database information but takes much more storage (>5 GB), while the `csv` dataset is much lighter (~800 MB) but only contains the most important fields. The `jsonl` dataset type is used by default.

Two dataset types are available: `csv` and `jsonl`. The `jsonl` dataset contains all the Open Food Facts database information but takes much more storage (>5 GB), while the `csv` dataset is much lighter (~700 MB) but only contains the most important fields.
You can also use `ProductDataset` to fetch other non-food datasets:

```python
from openfoodfacts import ProductDataset

The `jsonl` dataset type is used by default.
dataset = ProductDataset(dataset_type="csv")

for product in dataset:
    print(product["product_name"])
```
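
To illustrate the practical difference between the `csv` and `jsonl` dataset types, here is a small standard-library sketch that parses both formats from tiny made-up samples. It does not use the SDK: `iter_jsonl` and `iter_csv` are hypothetical helpers, the sample records are invented, and the tab delimiter is an assumption about the CSV export.

```python
import csv
import io
import json


def iter_jsonl(stream):
    """Yield one dict per non-empty line of a JSONL text stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def iter_csv(stream, delimiter="\t"):
    """Yield one dict per row; every value is a plain string."""
    yield from csv.DictReader(stream, delimiter=delimiter)


# Invented sample data standing in for the real dumps.
jsonl_sample = io.StringIO(
    '{"code": "0001", "product_name": "Sample bar", "nutriments": {"sugars_100g": 12.5}}\n'
    '{"code": "0002", "product_name": "Sample juice", "nutriments": {"sugars_100g": 9.0}}\n'
)
csv_sample = io.StringIO(
    "code\tproduct_name\n"
    "0001\tSample bar\n"
    "0002\tSample juice\n"
)

jsonl_products = list(iter_jsonl(jsonl_sample))
csv_products = list(iter_csv(csv_sample))

jsonl_names = [p["product_name"] for p in jsonl_products]
csv_names = [row["product_name"] for row in csv_products]

# JSONL records can carry nested structures; CSV rows are flat strings.
first_sugars = jsonl_products[0]["nutriments"]["sugars_100g"]
print(jsonl_names, csv_names, first_sugars)
```

Both loops expose `product_name`, which is why the iteration code above works for either dataset type; nested fields are only available in the JSONL records.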

## Taxonomies

For a deep dive on how to handle taxonomies, check out the [dedicated page](./handle_taxonomies.md).
