This repository contains annotated data meant to be used with supervised machine learning algorithms. The dataset was created specifically to classify sentences to the root comments of Danish political articles on social media.
The dataset consists of 9008 sentences that are labelled with fine-grained polarity in the range from -2 to 2 (negative to postive). The quality of the fine-grained is not cross validated and is therefore subject to uncertainties; however, the simple polarity has been cross validated and therefore is considered to be more correct.
The distribution of the classes:
The distribution of platforms collected from:
2:
- Positive action with positive intensifier.
- Direct positive sentiment towards topics, people or objects.
- Many tokens that indicate positivity e.g
1:
- Turns something negative into something positive.
- Something positive is being referred to.
- Positive tone/message.
- Wishes something positive without being eagerly formulated.
- Subtle expression of positive attitude towards individuals/person/objects.
- Few or no tokens that indicate positivity.
- A sentence that is near impossible to use negatively (yet often positively).
- They express agreement.
0:
- A sentence that can only be understood in context is inconclusive.
- That which is not immediate positive nor negative.
- Sentences that are not inherently positive or negative.
- Impartial expressions.
- Objective descriptions.
-1:
- They express disagreement.
- Tokens that indicate negativity e.g :(.
- Turns something positive into something negative.
- Something negative is being referred to.
- Negative tone/message.
- Wishes something negative without said being eagerly formulated.
- Subtle expression of negative attitude towards individuals/person/objects.
- A sentence that is near impossible to use positively (yet often negatively).
-2:
- Strong negative intensifiers or cursing.
- Direct/clear negative sentiment towards topics, people or objects.
- Many negative tokens e.g " :( :( :( :( :( ".