INDEX
Explanations
mentions of specific names as pairs or individuals
instances of individuals' names or titles in context
Negative Logits
olutions       -0.84
eco            -0.77
ipt            -0.68
itizens        -0.67
izen           -0.65
Higher         -0.64
metab          -0.64
adaptations    -0.64
outputs        -0.63
pec            -0.61

Positive Logits
whom            1.60
who             1.27
who             1.24
whose           1.14
whose           1.13
Jr              1.11
Sr              0.95
Jr              0.88
although        0.85
albeit          0.81
Activations Density: 0.271%