More improvements
nataliaElv committed Nov 20, 2024
1 parent 5b03fdd commit 3f54096
Showing 4 changed files with 24 additions and 38 deletions.
4 changes: 2 additions & 2 deletions chapters/en/chapter10/2.mdx
@@ -34,12 +34,12 @@ Let's connect with our Argilla instance. To do that you will need the following
```python
import argilla as rg

HF_TOKEN = "..." # only for private spaces
HF_TOKEN = "..." # only for private spaces

client = rg.Argilla(
api_url="...",
api_key="...",
headers={"Authorization": f"Bearer {HF_TOKEN}"} # only for private spaces
headers={"Authorization": f"Bearer {HF_TOKEN}"}, # only for private spaces
)
```
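
The `headers` argument above is marked as needed only for private Spaces. As a minimal sketch of that condition, the helper below builds the keyword arguments and adds the `Authorization` header only when a token is given. `build_argilla_kwargs` and the example URL, key, and token are hypothetical and not part of Argilla or of this commit; the final call to `rg.Argilla` is left commented out because it needs a running instance.

```python
from typing import Optional


def build_argilla_kwargs(
    api_url: str, api_key: str, hf_token: Optional[str] = None
) -> dict:
    """Assemble keyword arguments for rg.Argilla (hypothetical helper)."""
    kwargs = {"api_url": api_url, "api_key": api_key}
    if hf_token:
        # Only private Spaces need the extra Authorization header.
        kwargs["headers"] = {"Authorization": f"Bearer {hf_token}"}
    return kwargs


# Placeholder values, for illustration only.
public_kwargs = build_argilla_kwargs("https://example.invalid", "my-api-key")
private_kwargs = build_argilla_kwargs(
    "https://example.invalid", "my-api-key", hf_token="hf_xxx"
)

# client = rg.Argilla(**private_kwargs)  # requires `import argilla as rg`
```

Keeping the header optional means the same snippet can serve both public and private Spaces without editing the call site.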

29 changes: 11 additions & 18 deletions chapters/en/chapter10/3.mdx
@@ -18,12 +18,12 @@ The first step is to connect to our Argilla instance as we did in the previous s
```python
import argilla as rg

HF_TOKEN = "..." # only for private spaces
HF_TOKEN = "..." # only for private spaces

client = rg.Argilla(
api_url="...",
api_key="...",
headers={"Authorization": f"Bearer {HF_TOKEN}"} # only for private spaces
headers={"Authorization": f"Bearer {HF_TOKEN}"}, # only for private spaces
)
```

@@ -46,22 +46,18 @@ It contains a `text` and also some initial labels for the text classification. W

```python
settings = rg.Settings(
fields=[
rg.TextField(name="text")
],
fields=[rg.TextField(name="text")],
questions=[
rg.LabelQuestion(
name="label",
title="Classify the text:",
labels=data.unique("label_text")
name="label", title="Classify the text:", labels=data.unique("label_text")
),
rg.SpanQuestion(
name="entities",
title="Highlight all the entities in the text:",
labels=["PERSON", "ORG", "LOC", "EVENT"],
field="text"
)
]
name="entities",
title="Highlight all the entities in the text:",
labels=["PERSON", "ORG", "LOC", "EVENT"],
field="text",
),
],
)
```
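
To make the `entities` span question more concrete, here is a rough sketch of what a span annotation over the `text` field could look like. It assumes a span is a character range plus one of the labels from the settings, represented here as plain dicts with `start`, `end`, and `label` keys; that shape, the `make_span` helper, and the sample sentence are illustrative assumptions, not taken from this commit.

```python
ALLOWED_LABELS = ["PERSON", "ORG", "LOC", "EVENT"]  # same labels as in the settings


def make_span(text: str, phrase: str, label: str) -> dict:
    """Build a character-offset span for the first occurrence of `phrase`."""
    if label not in ALLOWED_LABELS:
        raise ValueError(f"Unknown label: {label}")
    start = text.index(phrase)  # raises ValueError if the phrase is absent
    return {"start": start, "end": start + len(phrase), "label": label}


# Made-up sample sentence, not from the ag_news dataset.
text = "Apple opened a new office in Paris."
spans = [
    make_span(text, "Apple", "ORG"),
    make_span(text, "Paris", "LOC"),
]

# Recover the highlighted substrings from the offsets.
highlighted = [(text[s["start"]:s["end"]], s["label"]) for s in spans]
```

Because the offsets index into the field's text, a span is only meaningful together with the field named in `field="text"`.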

@@ -79,10 +75,7 @@ To learn more about all the available types of fields and questions and other ad
Now that we've defined some settings, we can create the dataset:

```python
dataset = rg.Dataset(
name="ag_news",
settings=settings
)
dataset = rg.Dataset(name="ag_news", settings=settings)

dataset.create()
```
9 changes: 4 additions & 5 deletions chapters/en/chapter10/4.mdx
@@ -4,7 +4,7 @@

##TODO: Add screenshots!

Now it is time to start working from the Argilla UI to annotate your dataset.
Now it is time to start working from the Argilla UI to annotate our dataset.

## Align your team with annotation guidelines

@@ -20,9 +20,8 @@ Sometimes, you want to have more than one submitted response per record, for exa

## Annotate records

<Tip>
💡 If you are deploying Argilla in a Hugging Face Space, any team members will be able to log in using the Hugging Face OAuth. Otherwise, you may need to create users for them following [this guide](https://docs.argilla.io/latest/how_to_guides/user/).
</Tip>
>[!TIP]
>💡 If you are deploying Argilla in a Hugging Face Space, any team members will be able to log in using the Hugging Face OAuth. Otherwise, you may need to create users for them following [this guide](https://docs.argilla.io/latest/how_to_guides/user/).
When you open your dataset, you will realize that the first question is already filled in with some suggested labels. That's because in the previous section we mapped our question called `label` to the `label_text` column in the dataset, so that we simply need to review and correct the already existing labels. For the token classification, we'll need to add all labels manually, as we didn't include any suggestions.

@@ -34,7 +33,7 @@ As you move through the different records, there are different actions you can t
In the next section, you will learn how you can export and use those annotations.

---

Examples of images from other chapters:
<a class="flex justify-center" href="/huggingface-course/bert-finetuned-ner">
<img class="block dark:hidden lg:w-3/5" src="https://huggingface.co/datasets/huggingface-course/documentation-images/resolve/main/en/chapter7/model-eval-bert-finetuned-ner.png" alt="One-hot encoded labels for question answering."/>
<img class="hidden dark:block lg:w-3/5" src="https://huggingface.co/datasets/huggingface-course/documentation-images/resolve/main/en/chapter7/model-eval-bert-finetuned-ner-dark.png" alt="One-hot encoded labels for question answering."/>
20 changes: 7 additions & 13 deletions chapters/en/chapter10/5.mdx
@@ -9,12 +9,12 @@ First, we'll need to make sure that we're connected to our Argilla instance as i
```python
import argilla as rg

HF_TOKEN = "..." # only for private spaces
HF_TOKEN = "..." # only for private spaces

client = rg.Argilla(
api_url="...",
api_key="...",
headers={"Authorization": f"Bearer {HF_TOKEN}"} # only for private spaces
headers={"Authorization": f"Bearer {HF_TOKEN}"}, # only for private spaces
)
```

@@ -31,26 +31,20 @@ Loading the dataset and calling its records with `dataset.records` is enough to
Sometimes you only want to use the records that have been completed, so we will first filter the records in our dataset based on their status:

```python
status_filter = rg.Query(
filter=rg.Filter(
[
("status", "==", "completed")
]
)
)
status_filter = rg.Query(filter=rg.Filter([("status", "==", "completed")]))

filtered_records = dataset.records(status_filter)
```

<Tip>
⚠️ Note that the records could have more than one response and that each of them can have any status from `submitted`, `draft` or `discarded`.
</Tip>
>[!TIP]
>⚠️ Note that the records could have more than one response and that each of them can have any status from `submitted`, `draft` or `discarded`.

Learn more about querying and filtering records in the [Argilla docs](https://docs.argilla.io/latest/how_to_guides/query/).
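
The difference between a record's status and the status of each of its responses can be sketched with plain Python data. The dicts below are hypothetical stand-ins, not Argilla record objects, and the `pending` record status is an assumption added for contrast; only `completed`, `submitted`, `draft`, and `discarded` appear in the text above.

```python
from collections import Counter

# Hypothetical stand-in data: plain dicts, not Argilla objects.
records = [
    {"id": 1, "status": "completed",
     "responses": [{"status": "submitted"}, {"status": "discarded"}]},
    {"id": 2, "status": "pending",
     "responses": [{"status": "draft"}]},
    {"id": 3, "status": "completed",
     "responses": [{"status": "submitted"}]},
]

# Step 1: keep only completed records (what the status filter does).
completed = [r for r in records if r["status"] == "completed"]

# Step 2: a completed record can still carry non-submitted responses.
response_statuses = Counter(
    resp["status"] for r in completed for resp in r["responses"]
)

# Step 3: optionally keep only the submitted responses of each record.
submitted_only = [
    {**r, "responses": [x for x in r["responses"] if x["status"] == "submitted"]}
    for r in completed
]
```

In this toy data, filtering by record status alone still leaves one `discarded` response in the result, which is why a second pass over the responses can be useful before exporting.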

## Export to the Hub

We can now export our records to a Dataset in the Hugging Face Hub, so we can share our annotations with others. To do this, we'll need to convert the records into a Dataset and then push it to the Hub:
We can now export our annotations to the Hugging Face Hub, so we can share them with others. To do this, we'll need to convert the records into a 🤗 Dataset and then push it to the Hub:

```python
filtered_records.to_datasets().push_to_hub("argilla/ag_news_annotated")