INDEX
Explanations
adjectives describing the intensity or importance of specific attributes
adjectives and adverbs conveying strong negative or evaluative sentiments
Negative Logits
DRAG         -0.70
azo          -0.61
asio         -0.61
ynski        -0.60
cision       -0.57
Flavoring    -0.57
adish        -0.56
Vaults       -0.56
omething     -0.56
BF           -0.55
Positive Logits
haus             0.77
accessible       0.66
unatt            0.66
ripe             0.63
rongh            0.63
hit              0.61
readable         0.60
vind             0.60
competitive      0.60
unsustainable    0.59
Activations Density 0.245%
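
The density figure is commonly read as the fraction of sampled tokens on which the feature fires, i.e. has an activation above zero. The sketch below illustrates that reading only; the function name `activation_density`, the zero threshold, and the sample values are assumptions for illustration and are not taken from the dashboard's source.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumed definition: a feature "fires" on a token when its activation
    is strictly greater than the threshold (zero by default).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: the feature fires on 2 of 8 tokens.
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.4, 0.0]
density = activation_density(sample)
print(f"{density:.3%}")  # 25.000%
```

Under this reading, a density of 0.245% would mean the feature fires on roughly 1 in 400 sampled tokens.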