Explanations
negative or cautionary statements regarding political actions and their consequences
Negative Logits
Token         Logit
Previously    -0.76
Ö¼            -0.75
querque       -0.71
iliated       -0.70
igslist       -0.68
ĵ             -0.68
Later         -0.68
arag          -0.68
ãĤ¶           -0.68
onds          -0.67
Positive Logits
Token         Logit
ALWAYS        1.01
whichever     0.94
regardless    0.87
whatever      0.85
always        0.84
invariably    0.81
nevertheless  0.79
irrespective  0.78
whatsoever    0.76
whatever      0.75
Activations Density: 0.122%