Explanations
geographical names or locations
Negative Logits
Civilization  -0.17
aci           -0.15
udad          -0.15
ML            -0.15
ì¡            -0.14
arsi          -0.14
ÙĦÙĪØ¯        -0.14
ander         -0.14
eg            -0.13
ensch         -0.13
Positive Logits
shire         0.30
ople          0.17
emme          0.15
eshire        0.15
oulos         0.15
.va           0.15
zcze          0.15
ÑĪиÑĢ        0.15
avra          0.15
inha          0.15
Activations Density: 0.065%