detecting-sexism

As the final project for the Le Wagon Data Science & AI course, we built an NLP model that detects sexism in text. The initial classification is binary: 'sexist' or 'not sexist'.

In contrast to previous research and models, our dataset combines six text corpora and is not limited to social media.

Because the datasets use different annotation rules, we hoped to give the model richness and nuance, at the risk of making it 'overly accusatory'.
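Combining corpora that follow different annotation rules means mapping each label scheme onto the single binary target. The sketch below shows one way this could look. The corpus names, label names, and the `to_binary` helper are hypothetical placeholders, since this README does not list the actual datasets or their labels.

```python
from typing import Iterable

# Hypothetical label schemes: the real corpora and their label names are not
# listed in this README, so these mappings are placeholders.
LABEL_MAPS = {
    "corpus_a": {"sexist": 1, "not sexist": 0},
    "corpus_b": {"hostile": 1, "benevolent": 1, "none": 0},
    "corpus_c": {"1": 1, "0": 0},
}


def to_binary(rows: Iterable[tuple[str, str, str]]) -> list[dict]:
    """Map (corpus, text, raw_label) rows onto one binary 'sexist' label."""
    merged = []
    for corpus, text, raw_label in rows:
        mapping = LABEL_MAPS[corpus]
        key = raw_label.strip().lower()
        if key not in mapping:
            # Skip labels the mapping does not cover instead of guessing.
            continue
        merged.append({"text": text, "sexist": mapping[key], "source": corpus})
    return merged


rows = [
    ("corpus_a", "example text 1", "Sexist"),
    ("corpus_b", "example text 2", "benevolent"),
    ("corpus_b", "example text 3", "none"),
    ("corpus_c", "example text 4", "0"),
    ("corpus_c", "example text 5", "unclear"),
]
merged = to_binary(rows)
print([r["sexist"] for r in merged])  # [1, 1, 0, 0]
```

Keeping the `source` field makes it possible to check later whether one corpus's annotation rules dominate the model's false positives.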

Our final model, which can be tested here, is a fine-tuned BERT model optimized for precision.
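Precision is the share of texts flagged as sexist that really are sexist, so favoring it limits false accusations. The sketch below computes precision and recall by hand and shows how a stricter decision threshold can raise precision at the cost of recall. The scores, labels, and thresholds are made-up illustration values, and thresholding is only one possible way to favor precision, not necessarily how this model was trained.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels where 1 means 'sexist'."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def predict(scores, threshold):
    """Flag a text as sexist when its score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


# Made-up gold labels and model scores, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.95, 0.70, 0.80, 0.40, 0.20, 0.55, 0.65, 0.10]

results = {}
for threshold in (0.5, 0.75):
    results[threshold] = precision_recall(y_true, predict(scores, threshold))
    print(threshold, results[threshold])
# 0.5 (0.6, 0.75)
# 0.75 (1.0, 0.5)
```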

The options for further exploration include, but are not limited to:

Developing a French model
Developing a multilingual model
Augmenting our dataset(s)
- Translation
- Scraping Reddit/Instagram/YouTube/Twitch
- Generative AI
- Text templates (e.g., "I hate women, they're all (bitch/slut/whore)s")
Annotating our own dataset based on gender theory and language theory
Multi-class classification
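
The text-template augmentation idea listed above can be sketched as follows. The templates, slot names, and the `expand` helper are hypothetical, and the slot fillers are neutral placeholders standing in for a real lexicon. Every generated sentence is labeled as sexist (1) by construction.

```python
from itertools import product

# Hypothetical templates and slot fillers, for illustration only.
TEMPLATES = [
    "I hate {group}, they're all {insult}s",
    "{group} are always {insult}s",
]
SLOTS = {
    "group": ["women", "girls"],
    "insult": ["<insult_1>", "<insult_2>", "<insult_3>"],
}


def expand(template, slots):
    """Fill every slot used by the template with each combination of values."""
    names = [n for n in slots if "{" + n + "}" in template]
    results = []
    for combo in product(*(slots[n] for n in names)):
        results.append(template.format(**dict(zip(names, combo))))
    return results


# Pair each generated sentence with the positive label.
augmented = [(text, 1) for t in TEMPLATES for text in expand(t, SLOTS)]
print(len(augmented))  # 12
print(augmented[0])    # ("I hate women, they're all <insult_1>s", 1)
```

Template data is highly repetitive, so it would probably need to be mixed with natural text to keep the model from simply memorizing the patterns.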



Esmedd/detecting-sexism
