Explanations
proper names followed by "himself."
references to individuals in a self-referential context
Negative Logits
Token          Logit
olid           -0.88
okin           -0.71
CLOSE          -0.71
MSN            -0.70
ulton          -0.68
Conversation   -0.67
ayne           -0.67
urry           -0.66
ONES           -0.66
microsoft      -0.65
Positive Logits
Token          Logit
ortium         0.78
profess        0.73
contradicted   0.73
proport        0.73
predec         0.71
confessed      0.71
doct           0.71
greets         0.70
hars           0.70
conclud        0.67
Activations Density: 0.040%