INDEX
Explanations
references to historical events and figures
Negative Logits
arella    -0.18
commod    -0.18
ibold     -0.17
mates     -0.16
abilia    -0.15
Sesso     -0.15
odia      -0.15
mate      -0.15
Kate      -0.14
Gian      -0.14
Positive Logits
Glad            0.26
Winston         0.26
Churchill       0.25
Prime           0.22
Prime           0.21
Church          0.21
Conservative    0.21
LORD            0.20
Lord            0.20
Lords           0.18
Activations Density 0.058%