INDEX
Explanations
phrases related to public actions or gatherings, especially those involving voting or petitions
Head Attr Weights
0:0.02
1:0.02
2:0.14
3:0.16
4:0.10
5:0.06
6:0.04
7:0.05
8:0.05
9:0.10
10:0.15
11:0.06
Negative Logits
ONSORED        -1.43
)).            -1.35
Attributes     -1.30
ausp           -1.28
sill           -1.25
interviewer    -1.23
myster         -1.21
sole           -1.21
────           -1.18
enterprise     -1.15
Positive Logits
joice          1.60
›              1.59
utterstock     1.55
rieving        1.53
downed         1.45
unning         1.35
ustomed        1.33
rejoice        1.31
aste           1.24
icas           1.23
Activations Density 0.429%