INDEX
Explanations
references to historical events or notable figures
Negative Logits
olson     -0.16
Joe       -0.15
GTA       -0.15
ptal      -0.14
Israeli   -0.14
adlo      -0.14
kola      -0.14
andi      -0.14
Samoa     -0.14
antan     -0.14
Positive Logits
Tud         0.41
Henry       0.35
Elizabeth   0.33
Protestant  0.30
Catholic    0.29
Cardinal    0.28
Henry       0.28
Catholics   0.28
Anne        0.28
Wyatt       0.27
Activations Density 0.027%