Text experiments: test sentiments #465

Open
1 of 2 tasks
elboyran opened this issue Feb 14, 2023 · 5 comments


elboyran commented Feb 14, 2023

Design a list of at least 25 positive and negative words using the sentiment scale of the Stanford Sentiment Treebank.
One way to find the sentiment of a word is to use the browsing capability of the dataset above, limiting the sentence length to 3 words (possibly more tokens).

Another might be to look at the indexed original dataset's sentiment labels (normalized between 0 and 1):

sentiment_labels.txt
or other relevant files.
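
A minimal sketch of this second approach, assuming the usual Stanford Sentiment Treebank layout in which `dictionary.txt` holds `phrase|phrase id` lines and `sentiment_labels.txt` holds `phrase id|sentiment value` lines after a one-line header. The file layout and the helper name `lookup_word_scores` are assumptions for illustration and should be checked against the files on Surfdrive.

```python
from typing import Dict, Iterable


def lookup_word_scores(
    dictionary_lines: Iterable[str],
    label_lines: Iterable[str],
    words: Iterable[str],
) -> Dict[str, float]:
    """Return the normalized (0 to 1) sentiment value for each word found.

    Assumes 'phrase|id' lines in the dictionary and 'id|value' lines
    (after one header line) in the labels file.
    """
    wanted = {w.lower() for w in words}

    # Map phrase id -> word, keeping only the single-word phrases asked for.
    id_to_word = {}
    for line in dictionary_lines:
        phrase, _, phrase_id = line.rstrip("\n").rpartition("|")
        if phrase.lower() in wanted:
            id_to_word[phrase_id] = phrase.lower()

    scores = {}
    for i, line in enumerate(label_lines):
        if i == 0:
            continue  # skip the assumed header line
        phrase_id, _, value = line.strip().partition("|")
        if phrase_id in id_to_word:
            scores[id_to_word[phrase_id]] = float(value)
    return scores
```

Both line arguments can be open file handles, since a text file iterates line by line. Words that do not occur as a stand-alone phrase in the dictionary are simply absent from the result.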

Here is the parent link to the original and derived datasets on Surfdrive.

Create 25 test sentences of length 3 containing the above words, one sentence per word. They can be as simple as:

This is terrible
This is great
This is marvelous

  • Store the words in this issue.

  • Store the sentences in a .tsv file in the same format as the model's test data on Surfdrive.
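
The sentence and storage tasks above could be sketched as follows. The two-column layout (sentence, then label) and the function names `make_sentences` and `write_tsv` are assumptions for illustration; the actual columns and header of the model's test data on Surfdrive should be checked before relying on this.

```python
import csv
from typing import Iterable, List, TextIO, Tuple


def make_sentences(
    words_with_scores: Iterable[Tuple[str, float]],
) -> List[Tuple[str, float]]:
    """Build one three-word sentence, 'This is <word>', per input word."""
    return [(f"This is {word}", score) for word, score in words_with_scores]


def write_tsv(rows: Iterable[Tuple[str, float]], handle: TextIO) -> None:
    """Write (sentence, label) rows as tab-separated values.

    The column layout is a guess, not the confirmed test-data format.
    """
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
```

When writing to a real file, opening it with `newline=""` is the mode the `csv` module recommends, so that line endings are not translated a second time.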

Stems from #445 and dianna-ai/dianna-exploration#187 (see for Practicalities).

@elboyran

Related to dianna-exploration PR 159.

@WillemSpek don't forget to link the issues here with the PRs in the other repo ;-)

@elboyran

Also, there's something odd about this PR (159): it links to the images code, not to the text one. I cannot locate the work you did for the text (it's not in the Text branch). Please fix the code <-> PR link.


elboyran commented Jan 15, 2024

Compiled a list of words (adjectives) from the Stanford movie reviews dataset from which to choose test data for the Lorentz workshop ICT with Industry use case. Most of the words appear with the same score across the reviews in the dataset. Where I found different scores, they are indicated next to the word.

List of sentiment adjectives found in the Stanford movie reviews sentiment dataset and model

word positivity score (scale is from 1 (max negative) to 25 (max positive))

word positivity score(s)

baaaaaad 1

disgusting 1.75
dreadful 1.75

irritating 2

vulgar 2.3
horrible 2.3
unlikable 2.3

disappointing 3 4.75 (combined with slightly?)
pathetic 3

pointless 3.3
bad 3.75 4

depressing 4.75
worst 4.75

dull 5
appalling 5

boring 5.3

stupid 5.75

monotonous 6
cold 6

terrible 6.3
bizarre 6.3

unimaginative 6.75

nasty 7
tired 7

pitiful 7.3

awkward 7.75
mean 7.75
flawed 7.75

clunky 8
painful 8

rotten 8.3

shrewd 8.75 13.3

ugly 9
disguised 9

cliched 9.3
creepy 9.3

pretentious 9.75

overwhelming 10
lacking 10

obvious 10.3
redundant 10.3

bewildered 10.75

awful 11
grouchy 11
manipulative 11
vague 11

coarse 11.75 12.3

dark 11.75
mercenary 11.75
sordid 11.75

freak 11.3

restrained 12

spiritless 12.3
pressed 12.3
satisfactory 12.3

conventional 12.75

serious 13
light 13
ironic 13
extreme 13 13.75

melodramatic 13.3
predictable 13.3

earnest 13.75
easy 13.75

fast 14.3
superficial 14.3
emotional 14.3

ballistic 14.75

driven 15

smooth 15.3

artful 15.75

silly 16

cerebral 16.3
committed 16.3

acclaimed 16.5
artsy 16.75
stimulating 16.75

convenient 17
strong 17

exceeds 17.3

curious 17.75
gritty 17.75
gorgeous 17.75

subtle 18

poetic 18.3
interesting 18.3
charismatic 18.3

good 18.75

cinematic 14.3 19
astounding 19

fun 19.3
fantastic 19.3
pleasurable 19.3

good 19.75
appealing 19.75
funny 19.75
noteworthy 19.75

clever 20.3

engaging 20.75
happy 20.75

better 21 (also 19.75)

pretty 22
amusing 22
genuine 22

dazzling 22.75
entertaining 22.75
delightful 22.75

great 23.5

fabulous 24

brilliant 24.3

perfection 24.75

masterpiece 25
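
For reuse in scripts, a list like the one above can be read into a mapping from each word to its score(s). This is only a sketch: `parse_score_list` and `ambiguous_words` are hypothetical helpers, and the parser assumes every entry is a word followed by one or more numbers, with numbers inside a trailing note also counted as scores.

```python
import re
from typing import Dict, Iterable, List

NUMBER = re.compile(r"\d+(?:\.\d+)?")


def parse_score_list(lines: Iterable[str]) -> Dict[str, List[float]]:
    """Map each word to every score listed for it.

    Assumes 'word score [score ...]' entries; numbers inside a trailing
    note such as '(also 19.75)' are collected as well.
    """
    scores: Dict[str, List[float]] = {}
    for line in lines:
        parts = line.split(maxsplit=1)
        if len(parts) < 2:
            continue  # blank line or a word without a score
        word, rest = parts
        values = [float(m) for m in NUMBER.findall(rest)]
        if values:
            # A word may occur on several lines, so accumulate its scores.
            scores.setdefault(word, []).extend(values)
    return scores


def ambiguous_words(scores: Dict[str, List[float]]) -> List[str]:
    """Words that were found with more than one distinct score."""
    return sorted(w for w, v in scores.items() if len(set(v)) > 1)
```

Words returned by `ambiguous_words` are probably poor candidates for test sentences, because their expected sentiment is not unique.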

elboyran reopened this Jan 15, 2024

elboyran commented Jan 17, 2024

Simplified to integer values and adjectives only:

word score

worthless 1
irritating 2
excruciating 3
bad 4
nasty 5
lackluster 6
dizzying 7
clunky 8
tedious 9
confusing 10
grimy 11
stagy 12
intimate 13
visual 14
indelible 15
beguiling 16
modest 17
inventive 18
ultimate 19
epic 20
better 21
successful 22
excellent 23
fabulous 24
spectacular 25
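
To compare these integer scores with the normalized labels in `sentiment_labels.txt`, one option is a linear rescaling of the 1 to 25 scale onto 0 to 1. The linear mapping and the midpoint-based polarity below are assumptions made for illustration; this issue does not state how the 25-point scale relates to the normalized values.

```python
def to_unit_interval(score: float, low: float = 1.0, high: float = 25.0) -> float:
    """Linearly map a score from [low, high] to [0, 1]."""
    if not low <= score <= high:
        raise ValueError(f"score {score} outside [{low}, {high}]")
    return (score - low) / (high - low)


def polarity(score: float, low: float = 1.0, high: float = 25.0) -> str:
    """Label a score relative to the midpoint of the scale (13 for 1..25)."""
    midpoint = (low + high) / 2
    if score < midpoint:
        return "negative"
    if score > midpoint:
        return "positive"
    return "neutral"
```

Under this assumed mapping, a word scored 13 sits exactly at the midpoint, so sentences built from it can serve as a near-neutral reference case.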
