INDEX
Explanations
statements or opinions regarding current political events
Negative Logits
fortun      -0.78
Marketable  -0.75
smoot       -0.71
photoc      -0.70
ãĥ¼ãĥĨ      -0.68
pastry      -0.66
icing       -0.66
outl        -0.66
typew       -0.66
bes         -0.66
Positive Logits
Ļ  1.58
¬  1.19
ħ  1.14
į  1.11
ª  1.06
Ĵ  1.05
¤  1.02
£  1.00
µ  0.99
·  0.99
Activations Density 0.417%