Explanations
references to European countries and their associated events or cultural references
Negative Logits
Token     Logit
cea       -0.15
eras      -0.15
ing       -0.14
ся        -0.13
cie       -0.13
enheim    -0.13
lik       -0.13
Pok       -0.13
etsk      -0.13
806       -0.13
Positive Logits
Token     Logit
anness    0.22
(ns       0.19
-wide     0.17
-based    0.17
-Israel   0.15
-China    0.15
ased      0.15
arden     0.15
籍        0.15
:path     0.14
Activations Density: 0.311%