Explanations
names of individuals
mentions of specific locations and names associated with them
Negative Logits
awar      -0.96
rylic     -0.94
insula    -0.91
urgical   -0.83
nesota    -0.83
allion    -0.82
emonic    -0.81
icts      -0.80
jriwal    -0.79
manac     -0.79
Positive Logits
terday    0.76
Thornton  0.72
Beir      0.69
=-=-      0.68
hood      0.67
lyn       0.66
Debor     0.66
dale      0.66
Brom      0.65
Ń·        0.65
Activations Density 0.027%