by Francisco Herrera, Francisco Charte, Antonio J. Rivera, María J. del Jesus
Springer, 2016.
This repository provides the multilabel datasets used throughout the chapters of the book Multilabel Classification - Problem analysis, metrics and techniques, as well as some code and links. Click the folder corresponding to a chapter (in the list above) to download the files you are interested in. You can also clone the entire repository or download it as a ZIP file.
This book offers a comprehensive review of multilabel techniques widely used to classify and label texts, pictures, videos and music on the Internet. A deep review of the specialized literature in the field includes the available software needed to work with this kind of data. It provides the user with the software tools needed to deal with multilabel data, as well as step-by-step instructions on how to use them. The main topics covered are:
- The special characteristics of multi-labeled data and the metrics available to measure them.
- The importance of taking advantage of label correlations to improve the results.
- The different approaches followed to face multi-label classification.
- The preprocessing techniques applicable to multi-label datasets.
- The available software tools to work with multi-label data.
This book is beneficial for professionals and researchers in a variety of fields because of the wide range of potential applications for multilabel classification. Besides its many applications in classifying different types of online information, it is also useful in other areas, such as genomics and biology. No previous knowledge of the subject is required. The book introduces all the concepts needed to understand multilabel data characterization, treatment and evaluation.
The following is a summary of the links provided throughout the book. Each chapter's folder also contains the links relevant to its topic.
Most of the available multilabel datasets can be obtained from the following data repositories:
R Ultimate Multilabel Dataset Repository - RUMDR
Extreme Classification Repository
The following are links to multilabel software tools:
mldr
R package at CRAN - GitHub
mldr.datasets
R package at CRAN - GitHub
Synthetic Dataset Generator for Multi-label Learning - Mldatagen
ML-TREE - Source code with the reference implementation of the ML-TREE algorithm
Rank-SVM - Source code with the reference implementation of the Rank-SVM algorithm
MLSMOTE - Java implementation of the MLSMOTE algorithm
MLeNN - Java implementation of the MLeNN algorithm
REMEDIAL - R source code for the REMEDIAL algorithm
This content is licensed under LGPLv3. What does this mean?