INDEX
Explanations
proper nouns, especially names related to specific people and places
Negative Logits
Meksiku    -0.57
Elbe       -0.52
WIS        -0.51
bụ         -0.51
}")]       -0.50
Osama      -0.50
thail      -0.49
Skocz      -0.49
)$}        -0.49
Mombasa    -0.49
Positive Logits
Armenian   1.06
Armenia    0.98
armen      0.95
Yerevan    0.90
Armen      0.88
Armenians  0.86
Armenia    0.86
armen      0.84
Armen      0.83
erevan     0.72
Activations Density: 0.059%