Explanations
instances of contemptuous behavior or situations depicting contempt
instances of the word "contempt" and related terms reflecting disapproval or disdain
Negative Logits
hemor         -0.70
encyclopedia  -0.70
ramid         -0.66
toget         -0.64
Lans          -0.64
livest        -0.64
scen          -0.64
akeru         -0.63
misunder      -0.63
opio          -0.62
Positive Logits
uously  1.45
uous    1.38
fully   1.19
ful     1.06
ible    0.93
ibly    0.91
uality  0.88
urous   0.88
acy     0.86
ateurs  0.85
Activations Density 0.036%