Explanations
phrases or words indicating extreme negativity or criticism
the word "utter" and its variations to indicate strong emphasis or negative evaluations
Negative Logits
Token       Logit
pei         -0.79
amsung      -0.79
apego       -0.71
oult        -0.70
anwhile     -0.69
xual        -0.67
aspberry    -0.67
OHN         -0.67
elled       -0.67
abbit       -0.66
Positive Logits
Token       Logit
utter       1.14
ance        0.84
ances       0.83
uttered     0.81
aloud       0.80
amaz        0.80
TY          0.76
ingly       0.76
çIJ         0.73
nesses      0.71
Activations Density: 0.003%