Explanations
words related to names and titles of individuals
references to specific individuals, particularly the name "Abe."
Negative Logits
Token        Logit
igans        -0.73
Cassidy      -0.66
LINE         -0.64
otle         -0.62
ivity        -0.62
oner         -0.60
aldehyde     -0.60
Telephone    -0.60
Xie          -0.60
vans         -0.58
Positive Logits
Token        Logit
ment         0.84
llor         0.82
ful          0.81
tered        0.81
untarily     0.78
cy           0.78
inel         0.78
tle          0.74
uer          0.74
ures         0.74
Activation Density: 0.043%
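
The page does not define the density figure. A minimal sketch of one common reading, assuming it is the fraction of token positions on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 4 of which activate the feature.
sample = [0.0] * 9996 + [1.2, 0.4, 3.1, 0.05]
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.043% would mean the feature fires on roughly 4 in every 10,000 token positions.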