Explanations
European countries and geography
Negative Logits
Token     Value
i         0.72
b         0.59
k         0.53
on        0.50
d         0.49
on        0.49
कर्ता       0.47
a         0.47
ne        0.46
mantra    0.45
Positive Logits
Token          Value
Italy          0.55
Switzerland    0.54
Germany        0.51
Germany        0.51
Italy          0.51
、             0.49
Itália         0.48
Danube         0.46
Switzerland    0.46
Spain          0.46
Activations Density 0.413%