INDEX
Head Attr Weights
0:0.06
1:0.09
2:0.09
3:0.07
4:0.09
5:0.07
6:0.10
7:0.09
8:0.06
9:0.06
10:0.09
11:0.08
Negative Logits
Enemy        -1.82
DonaldTrump  -1.82
��           -1.80
�            -1.76
Citizens     -1.75
Saunders     -1.70
Edwin        -1.70
ESSION       -1.69
Thief        -1.67
Cummings     -1.66
Positive Logits
outweigh     1.97
pun          1.71
hull         1.68
clen         1.64
chlor        1.64
outwe        1.61
licens       1.58
moss         1.54
inhibitor    1.54
akespe       1.54
Activations Density 0.000%