Explanations
testimonies or statements related to social issues or controversies
Head Attr Weights
0:0.04
1:0.02
2:0.03
3:0.06
4:0.43
5:0.04
6:0.08
7:0.06
8:0.03
9:0.04
10:0.06
11:0.06
Negative Logits
paralle  -3.02
conclud  -2.79
compr  -2.65
Conclusion  -2.64
ebook  -2.47
subconscious  -2.40
Pattern  -2.35
Completed  -2.30
CONCLUS  -2.30
Magikarp  -2.29
Positive Logits
️  4.00
@  3.71
pic  3.54
https  3.49
Breitbart  3.41
tweeted  3.27
RT  3.24
�  3.23
(@  3.21
�  3.18
Activations Density 0.490%