INDEX
Explanations
phrases related to policy and socio-economic issues, such as minimum wage, public education, and taxes
Negative Logits
involved    -0.69
cture       -0.60
cific       -0.54
endum       -0.54
minist      -0.53
neau        -0.52
iral        -0.51
insider     -0.50
ãĤ¨ãĥ«      -0.50
ersen       -0.49
Positive Logits
etc         0.60
and         0.56
(.          0.55
ĵĺ          0.55
mathemat    0.53
Gujar       0.53
(!          0.52
Ĥ¬          0.51
,.          0.49
à¼          0.49
Activations Density: 0.691%
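
The density figure is usually read as the share of token positions on which the feature has a nonzero activation. The sketch below illustrates that reading only; the page does not state how its 0.691% was computed, so the function name `activation_density`, the `threshold` parameter, and the toy data are all assumptions made for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions where the feature fires.

    Assumption: "fires" means the activation is strictly above `threshold`.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1,000 token positions, 7 with a nonzero activation.
toy = [0.0] * 993 + [0.4, 1.2, 0.1, 2.5, 0.8, 0.3, 0.9]
density = activation_density(toy)
print(f"{density:.3%}")  # 0.700%
```

Under this reading, a density of 0.691% would mean the feature is active on roughly 7 of every 1,000 tokens in whatever sample was measured.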