INDEX
Explanations
adjectives related to negative characteristics or attitudes
negative descriptors related to unpleasantness or harm
NEGATIVE LOGITS
Illum        -0.74
sych         -0.73
vim          -0.71
Emblem       -0.68
sonian       -0.67
ilot         -0.66
ĸļ           -0.65
med          -0.65
Interpret    -0.64
Sets         -0.64
POSITIVE LOGITS
nasty          2.82
degradation    2.15
degrading      2.06
degraded       1.95
degrade        1.79
vile           1.49
vicious        1.44
brut           1.32
unpleasant     1.29
horrible       1.22
Activations Density 0.033%