Explanations
expressions of blame or criticism towards individuals and groups
Head Attr Weights
0:0.04
1:0.03
2:0.16
3:0.05
4:0.14
5:0.06
6:0.03
7:0.02
8:0.21
9:0.12
10:0.07
11:0.03
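
The weights above can be ranked to see which heads contribute most. The following is a minimal sketch, assuming each line is a `head:weight` pair and that the weights are rounded per-head attribution shares; the helper name `parse_head_weights` is hypothetical and not part of any dashboard tooling.

```python
# Head attribution weights copied verbatim from the listing above.
RAW_WEIGHTS = """0:0.04
1:0.03
2:0.16
3:0.05
4:0.14
5:0.06
6:0.03
7:0.02
8:0.21
9:0.12
10:0.07
11:0.03"""


def parse_head_weights(text):
    """Parse 'head:weight' lines into a {head_index: weight} dict (hypothetical helper)."""
    weights = {}
    for line in text.splitlines():
        head, value = line.split(":")
        weights[int(head)] = float(value)
    return weights


weights = parse_head_weights(RAW_WEIGHTS)

# Sort heads by weight, largest first, and keep the top three.
ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
top3 = ranked[:3]

# The listed values are rounded, so the total need not be exactly 1.
total = sum(weights.values())

print(top3)             # [(8, 0.21), (2, 0.16), (4, 0.14)]
print(round(total, 2))  # 0.96
```

Under these assumptions, heads 8, 2, and 4 together account for a little over half of the listed weight.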
Negative Logits
emouth      -1.43
IPM         -1.40
��          -1.34
tion        -1.28
spanning    -1.28
ispers      -1.27
skilled     -1.26
direction   -1.22
endeavour   -1.21
encount     -1.20
Positive Logits
Trayvon     1.61
gio         1.41
Guilty      1.41
gins        1.38
uer         1.29
Madden      1.27
Schwarz     1.26
Shame       1.25
Rah         1.24
Moments     1.24
Activations Density 0.007%