INDEX
Explanations
sentences expressing an opinion about what actions should be taken
phrases calling for moral or legal accountability
Negative Logits
ullivan       -0.76
Wid           -0.67
Cros          -0.65
lip           -0.63
Others        -0.62
ゼウス        -0.62
anth          -0.59
ous           -0.59
NetMessage    -0.59
anos          -0.59
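Tokenizers with a GPT-2-style byte-level vocabulary display non-ASCII tokens such as ゼウス as strings like ãĤ¼ãĤ¦ãĤ¹, because each raw UTF-8 byte is mapped to a printable stand-in character. The sketch below shows how such a string can be mapped back to readable text. It assumes the model behind this dashboard uses that byte-to-character table, which the page itself does not state; `decode_token` is a hypothetical helper name.

```python
def bytes_to_unicode() -> dict[int, str]:
    """Byte -> stand-in character table used by GPT-2-style byte-level BPE."""
    # Printable bytes keep their own code point.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every other byte is shifted to a code point at 256 or above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


# Inverse table: stand-in character -> raw byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Hypothetical helper: turn a vocabulary string back into readable text."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    # A single token may hold an incomplete UTF-8 sequence, hence errors="replace".
    return raw.decode("utf-8", errors="replace")


print(decode_token("ãĤ¼ãĤ¦ãĤ¹"))  # ゼウス
```

Plain ASCII tokens such as `avoided` pass through unchanged, so the helper can be applied to every row of a logit list.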
Positive Logits
avoided          1.25
ashamed          1.19
applauded        1.14
judged           1.10
treated          1.08
regarded         1.08
able             1.07
congratulated    1.03
viewed           1.03
considered       1.00
Activations Density: 0.098%
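The dashboard does not define how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled tokens on which the feature's activation is above zero; the function name, the threshold parameter, and the sample data are all hypothetical.

```python
def activation_density(activations: list[float], threshold: float = 0.0) -> float:
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all. The dashboard does not confirm this definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Illustrative data only: the feature fires on 1 of 1000 sampled tokens.
sample = [0.0] * 999 + [2.3]
print(f"{activation_density(sample):.3%}")  # 0.100%
```

Under this reading, a density of 0.098% would mean the feature fires on roughly one token in a thousand.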