INDEX
Explanations
terms and phrases related to liberalism and political ideology
Negative Logits
eger      -0.17
ة         -0.16
ish       -0.16
ings      -0.16
izza      -0.15
кул       -0.14
ủi        -0.14
avorites  -0.14
Queries   -0.13
puberty   -0.13
Positive Logits
elle      0.16
ialis     0.16
inalg     0.15
rett      0.15
542       0.14
IGO       0.14
PTR       0.14
bersome   0.14
sss       0.14
undi      0.14
Activations Density 0.023%