INDEX
Explanations
country names
country names or references to nations
Negative Logits
ãĥ«     -0.70
TA      -0.69
kt      -0.68
atu     -0.67
sta     -0.67
ista    -0.67
Ult     -0.66
Tan     -0.66
Tab     -0.65
Tab     -0.64
Positive Logits
gray    0.85
311     0.79
GN      0.78
310     0.77
Gray    0.76
GP      0.74
Gre     0.73
361     0.73
hin     0.73
Gray    0.72
Activations Density 0.468%