Explanations
terms related to changes in opinions or positions
Negative Logits
gaard    -0.16
rive     -0.15
ắp       -0.15
pf       -0.15
ova      -0.14
nde      -0.14
pa       -0.14
inge     -0.13
dio      -0.13
oso      -0.13
Positive Logits
Fauc     0.15
ocs      0.15
semble   0.15
перес    0.14
ews      0.14
rawn     0.14
bends    0.14
نخ       0.14
станов   0.14
SEMB     0.14
Activations Density 0.216%