INDEX
Explanations
expressions indicating personal beliefs or opinions on governmental matters
statements related to opinions on what the government ought to do
Negative Logits
Token        Logit
ZI           -0.71
Fra          -0.67
atile        -0.66
ROS          -0.65
ãĤ¼ãĤ¦ãĤ¹    -0.65
Puzzle       -0.64
reality      -0.64
Saiyan       -0.63
locked       -0.63
GGGGGGGG     -0.61
Positive Logits
Token        Logit
ered         1.11
be           1.00
ering        0.89
©¶æ          0.83
nt           0.83
ideally      0.81
beware       0.81
bes          0.79
ĪĴ           0.78
aspire       0.77
Activations Density 0.066%
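
The density figure above can be read as a simple ratio. The sketch below is only an illustration and assumes that "activations density" means the fraction of sampled tokens on which the feature's activation is above zero; the source does not define the term. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only to reproduce a value of the same size.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density = (tokens where the feature fires) / (all sampled tokens).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: the feature fires on 66 of 100,000 tokens.
sample = [0.0] * 99_934 + [1.0] * 66
density = activation_density(sample)

# Format as a percentage with three decimals, matching the dashboard style.
print(f"Activations Density {density:.3%}")
```

Under this reading, a density of 0.066% means the feature is sparse: it fires on roughly 66 tokens out of every 100,000 sampled.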