INDEX
Explanations
instances of political positions and their interactions
Negative Logits
RICT        -0.16
eref        -0.15
oves        -0.15
æĸŃ         -0.15
yre         -0.15
اÙĦÙĪÙĦ      -0.14
ysa         -0.14
oufl        -0.14
ило         -0.14
uchen       -0.14
Positive Logits
hum         0.16
unt         0.16
Vulner      0.15
112         0.15
urg         0.14
sor         0.14
chk         0.14
Disclosure  0.14
vulner      0.14
razil       0.14
Activations Density 0.278%