ua_datasets

UA-datasets is a collection of Ukrainian language datasets. Our aim is to build a benchmark for research related to natural language processing in Ukrainian.

This library is provided by FIdo.ai (the machine learning research division of the non-profit student organization FIdo, National University of Kyiv-Mohyla Academy) for research purposes.

Installation

The library can be installed from PyPI into your virtual environment (e.g. venv or a conda env):

pip install ua_datasets
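
To confirm that the install landed in the active environment, a small check along these lines may help. This is a minimal sketch using only the Python standard library; the helper name `installed_version` is hypothetical and is not part of ua_datasets.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "ua_datasets") -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed.

    `installed_version` is an illustrative helper, not a ua_datasets function.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("ua_datasets is not installed in this environment")
    else:
        print(f"ua_datasets {version} is installed")
```

Run this with the interpreter of the virtual environment you installed into; a "not installed" message usually means pip targeted a different environment.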

Latest Updates

05.07.22 - Added HuggingFace API for Q&A (UA-SQuAD) and Text Classification (UA-News) datasets

Available Datasets

Question Answering (UA-SQuAD)
Text Classification (UA-News)
Token Classification (Mova Institute Part of Speech)
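
Since the UA-SQuAD and UA-News datasets are also exposed through the HuggingFace API, loading them might look roughly like the sketch below. It assumes the separate `datasets` package is installed; the Hub identifiers in `HUB_IDS`, the `"train"` split name, and the helpers `resolve_hub_id` and `load_ua_dataset` are assumptions made for illustration, so check the Hub for the actual dataset names before relying on them.

```python
# Assumed Hugging Face Hub identifiers, for illustration only.
# Verify the real names on the Hub before use.
HUB_IDS = {
    "question-answering": "FIdo-AI/ua-squad",
    "text-classification": "FIdo-AI/ua-news",
}


def resolve_hub_id(task: str) -> str:
    """Map a task name to its (assumed) Hub dataset identifier."""
    try:
        return HUB_IDS[task]
    except KeyError:
        raise ValueError(
            f"Unknown task {task!r}; expected one of {sorted(HUB_IDS)}"
        ) from None


def load_ua_dataset(task: str, split: str = "train"):
    """Download one split of a dataset through the `datasets` library.

    The import is lazy so this module can be imported without `datasets`.
    The default split name "train" is an assumption.
    """
    from datasets import load_dataset

    return load_dataset(resolve_hub_id(task), split=split)
```

Calling `load_ua_dataset("question-answering")` needs network access and `pip install datasets`. The Token Classification dataset is left out of the mapping because the update note mentions the HuggingFace API only for the other two.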

Contribution

To contribute (for example, to update part of the library or add your own dataset), please open a GitHub issue. Thanks in advance for your contribution! Let's make the Ukrainian language even greater!

Citation

@software{ua_datasets_2021,
  author = {Ivanyuk-Skulskiy, Bogdan and Zaliznyi, Anton and Reshetar, Oleksand and Protsyk, Oleksiy and Romanchuk, Bohdan and Shpihanovych, Vladyslav},
  month = oct,
  title = {ua_datasets: a collection of Ukrainian language datasets},
  url = {https://github.com/fido-ai/ua-datasets},
  version = {0.0.1},
  year = {2021}
}

