Explanations
names of locations and institutions, especially in France
Negative Logits
Token       Logit
lfw         -0.17
Mash        -0.17
perial      -0.16
pei         -0.16
CGColor     -0.15
Leaf        -0.15
apor        -0.15
微软雅黑    -0.14
cion        -0.14
Vere        -0.14
Positive Logits
Token       Logit
France      0.33
France      0.30
Paris       0.28
Paris       0.25
فرانسه      0.24
French      0.24
france      0.24
Пари        0.24
француз     0.23
Fransız     0.23
Activations Density 0.319%
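
The density figure above can be read as the share of token positions on which the feature fires at all. The sketch below illustrates that reading under stated assumptions: the function name `activation_density`, the zero threshold, and the sample data are all hypothetical and are not taken from the tool that produced this page.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the feature is
    active (activation > 0). The source page does not state its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 1,000 token positions, 3 of which activate the feature.
sample = [0.0] * 997 + [1.2, 0.4, 2.5]
print(f"{activation_density(sample):.3%}")
```

On this reading, a density of 0.319% would mean the feature is active on roughly 3 out of every 1,000 sampled tokens, which fits a feature tied to a narrow topic such as France-related names.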