INDEX
Explanations
countries and political entities
entities related to political or social groups and their affiliations
Negative Logits
Stam    -0.64
antha   -0.62
ËĪ      -0.61
Aval    -0.60
Loyal   -0.60
Lent    -0.59
_.      -0.57
().     -0.57
±       -0.56
Jah     -0.56
Positive Logits
alike         1.78
are           1.07
were          1.06
respectively  0.98
weren         0.92
aren          0.92
collide       0.87
have          0.86
remain        0.85
collided      0.80
Activations Density 0.382%