INDEX
Explanations
references to social media platforms and protests
Head Attr Weights
0:0.01
1:0.02
2:0.11
3:0.06
4:0.35
5:0.02
6:0.04
7:0.19
8:0.02
9:0.03
10:0.06
11:0.04
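
The head weights above can be read programmatically. The following is a minimal sketch, assuming each line is a `head:weight` pair and that a larger weight means a larger share of attribution for that head (the source does not define the weights, so this reading is an assumption). The names `RAW` and `parse_head_weights` are hypothetical and chosen for illustration only.

```python
# Head attribution weights copied verbatim from the list above ("head:weight").
RAW = """0:0.01
1:0.02
2:0.11
3:0.06
4:0.35
5:0.02
6:0.04
7:0.19
8:0.02
9:0.03
10:0.06
11:0.04"""


def parse_head_weights(text):
    """Parse 'head:weight' lines into a {head_index: weight} dict."""
    weights = {}
    for line in text.strip().splitlines():
        head, weight = line.split(":")
        weights[int(head)] = float(weight)
    return weights


weights = parse_head_weights(RAW)

# Rank heads from largest to smallest weight and keep the top three.
ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
top_heads = ranked[:3]

# The listed values are rounded to two decimals, so the total need not be exactly 1.
total = round(sum(weights.values()), 2)

print(top_heads)  # [(4, 0.35), (7, 0.19), (2, 0.11)]
print(total)      # 0.95
```

Under that reading, heads 4, 7, and 2 together account for 0.65 of the listed weight.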
Negative Logits
acqu:-1.49
iliated:-1.47
Keeper:-1.47
ificantly:-1.41
Rot:-1.38
skilled:-1.37
depended:-1.31
ensed:-1.29
ummies:-1.27
ivably:-1.27
Positive Logits
ンジ:1.48
Speedway:1.46
antry:1.45
CES:1.38
arity:1.37
ッ:1.36
aisle:1.36
Country:1.36
EVA:1.36
actionGroup:1.33
Activations Density 0.006%