INDEX
Explanations
expressions that indicate group participation or collective experiences
Negative Logits
ilan      -0.17
udo       -0.15
inka      -0.15
asto      -0.15
åĹ        -0.15
BJ        -0.14
illet     -0.14
imizde    -0.14
YLE       -0.14
UDO       -0.14
Positive Logits
ertz      0.17
719       0.17
ivor      0.17
ubar      0.16
axon      0.15
Patri     0.15
Axis      0.15
æľĽ       0.15
FD        0.15
ียม       0.15
Activations Density: 0.048%