INDEX
Explanations
phrases related to political commentary and reactions
Head Attr Weights
0:0.02
1:0.02
2:0.04
3:0.04
4:0.03
5:0.03
6:0.03
7:0.43
8:0.02
9:0.04
10:0.13
11:0.12
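
As a quick way to read the head attribution weights above, the sketch below ranks the heads by weight and sums the listed values. The names `head_attr` and `rank_heads` are hypothetical, introduced only for illustration. The numbers are copied from the listing, and nothing here reflects how the dashboard itself computes them.

```python
# Head attribution weights copied from the listing above (head index -> weight).
# The variable and function names are illustrative assumptions, not a dashboard API.
head_attr = {
    0: 0.02, 1: 0.02, 2: 0.04, 3: 0.04, 4: 0.03, 5: 0.03,
    6: 0.03, 7: 0.43, 8: 0.02, 9: 0.04, 10: 0.13, 11: 0.12,
}


def rank_heads(weights):
    """Return (head, weight) pairs sorted from largest to smallest weight."""
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)


ranked = rank_heads(head_attr)
top_head, top_weight = ranked[0]
total = sum(head_attr.values())

print(f"top head: {top_head} (weight {top_weight:.2f})")
print(f"sum of listed weights: {total:.2f}")
```

Head 7 ranks first at 0.43, followed by heads 10 and 11. The listed values are rounded to two decimals, so their sum need not be exactly 1.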
Negative Logits
vable        -1.51
competition  -1.51
relegation   -1.50
aisle        -1.46
bargaining   -1.42
basket       -1.42
negoti       -1.41
paramedics   -1.37
bidder       -1.36
orchestra    -1.35
Positive Logits
Tweet        1.77
warning      1.70
tweets       1.66
javascript   1.59
weet         1.59
ティ          1.58
message      1.55
delete       1.53
Hello        1.53
(token not shown)  1.51
Activations Density 0.042%