Explanations
Terms related to political ideologies, particularly liberalism and conservatism.
Negative Logits
aday          -0.16
jee           -0.15
ose           -0.15
VEC           -0.15
istrovství    -0.15
rei           -0.15
.INSTANCE     -0.14
ings          -0.14
ee            -0.14
рек           -0.14
Positive Logits
-leaning      0.28
/lib          0.27
-lib          0.18
credentials   0.17
-rad          0.17
-minded       0.17
lero          0.16
ød            0.16
jerne         0.16
ness          0.16
Activations Density: 0.046%