INDEX
Explanations
social and political movements
Negative Logits
are     1.42
i       1.27
in      1.23
that    1.22
p       1.17
ö       1.16
ite     1.09
om      1.06
int     1.06
。      1.05
Positive Logits
ت          1.45
т          1.29
Movement   1.16
movement   1.13
encontr    1.09
)";        1.03
).$$       1.02
powied     0.99
Movement   0.98
kutoka     0.98
Activations Density 0.012%