## Model Description
The model is fine-tuned from [WavLM base plus](https://arxiv.org/abs/2110.13900) on 2,374 hours of audio clips from
voice chat for multilabel classification. The audio clips are automatically labeled using a synthetic data pipeline
described in [our blog post](link to blog post here). A single output can have multiple labels. The model outputs an
n-by-6 tensor where the inferred labels are `Profanity`, `DatingAndSexting`, `Racist`, `Bullying`, `Other`, and
`NoViolation`. `Other` combines low-prevalence policy-violation categories, such as drugs and alcohol or self-harm,
into a single label.
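
For illustration only, here is a minimal inference sketch assuming the checkpoint can be loaded with Hugging Face
`transformers`; the checkpoint path, the dummy audio, and the 0.5 decision threshold are placeholders rather than the
repository's documented usage.

```python
import numpy as np
import torch
from transformers import AutoFeatureExtractor, WavLMForSequenceClassification

LABELS = ["Profanity", "DatingAndSexting", "Racist", "Bullying", "Other", "NoViolation"]

# Hypothetical checkpoint path; substitute the actual released weights.
model_path = "path/to/finetuned-wavlm-checkpoint"
feature_extractor = AutoFeatureExtractor.from_pretrained(model_path)
model = WavLMForSequenceClassification.from_pretrained(model_path)
model.eval()

# `clips` stands in for a batch of n mono voice-chat clips at 16 kHz.
clips = [np.random.randn(16000).astype(np.float32) for _ in range(2)]
inputs = feature_extractor(clips, sampling_rate=16000, return_tensors="pt", padding=True)

with torch.no_grad():
    logits = model(**inputs).logits   # shape (n, 6), one column per label
scores = torch.sigmoid(logits)        # independent per-label probabilities (multilabel)
predictions = [
    [LABELS[j] for j in range(len(LABELS)) if scores[i, j] > 0.5]  # 0.5 is an assumed threshold
    for i in range(scores.shape[0])
]
print(predictions)
```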

We evaluated this model on a dataset with human-annotated labels containing a total of 9,795 samples, with the class
distribution shown below. Note that we did not include the `Other` category in this evaluation dataset.

|Class|Number of examples| Duration (hours)|% of dataset|
|---|---|---|---|


If we set the same threshold across all classes and treat the model as a binary classifier across the four toxicity classes (`Profanity`, `DatingAndSexting`, `Racist`, `Bullying`), we get a binarized average precision of 94.48%. The precision-recall curve is shown below.


<p align="center">
<img src="images/human_eval_pr_curve.png" alt="PR Curve" width="500"/>
</p>
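
One way to compute the binarized metric, sketched under the assumption that a clip is flagged when any of the four
toxicity scores clears the shared threshold (the array names `scores` and `is_toxic` are hypothetical):

```python
# Assumes `scores` is an (n, 6) array of per-label model probabilities ordered as in
# LABELS above, and `is_toxic` is an n-length 0/1 array marking clips whose human
# label is any of the four toxicity classes; both are illustrative placeholders.
import numpy as np
from sklearn.metrics import average_precision_score, precision_recall_curve

toxic_idx = [0, 1, 2, 3]  # Profanity, DatingAndSexting, Racist, Bullying

def binarized_metrics(scores: np.ndarray, is_toxic: np.ndarray):
    # One score per clip: thresholding "any class exceeds t" is equivalent to
    # thresholding the per-clip max over the four toxicity classes.
    binary_score = scores[:, toxic_idx].max(axis=1)
    ap = average_precision_score(is_toxic, binary_score)
    precision, recall, _ = precision_recall_curve(is_toxic, binary_score)
    return ap, precision, recall
```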
