INDEX
Explanations
phrases related to regime change and political upheaval
Head Attr Weights
0:0.02
1:0.02
2:0.08
3:0.11
4:0.02
5:0.04
6:0.15
7:0.27
8:0.07
9:0.03
10:0.06
11:0.08
Negative Logits
龍�:-1.22
chrom:-1.09
ournals:-1.07
empath:-1.05
doi:-1.03
Chrom:-1.00
akespeare:-1.00
salon:-1.00
Collider:-1.00
nutrit:-0.99
Positive Logits
ć:1.55
CVE:1.40
knife:1.31
lords:1.23
ogun:1.17
numbering:1.14
SPONSORED:1.13
Naz:1.12
rahim:1.11
Commands:1.08
Activations Density 0.008%