INDEX
Explanations
proper nouns and names associated with politicians or political entities
Head Attr Weights
0:0.03
1:0.01
2:0.06
3:0.06
4:0.06
5:0.03
6:0.43
7:0.04
8:0.04
9:0.06
10:0.07
11:0.05
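The listing above can be read programmatically. The following is a minimal sketch, assuming each `index:weight` pair is an attribution weight for one of 12 attention heads (the source does not define the term). The helper names `dominant_head` and `share_of_total` are hypothetical and not part of any tool.

```python
# Head attribution weights copied from the listing above (head index -> weight).
# Assumption: each entry is the weight attributed to one attention head.
head_weights = {
    0: 0.03, 1: 0.01, 2: 0.06, 3: 0.06, 4: 0.06, 5: 0.03,
    6: 0.43, 7: 0.04, 8: 0.04, 9: 0.06, 10: 0.07, 11: 0.05,
}


def dominant_head(weights):
    """Return (head, weight) for the entry with the largest weight."""
    head = max(weights, key=weights.get)
    return head, weights[head]


def share_of_total(weights, head):
    """Return the fraction of the summed listed weights carried by one head."""
    return weights[head] / sum(weights.values())


head, weight = dominant_head(head_weights)
print(f"dominant head: {head} (weight {weight})")
print(f"share of listed total: {share_of_total(head_weights, head):.2f}")
```

On these values, head 6 stands out: its weight is several times larger than any other. The listed weights do not sum to exactly 1, presumably because of rounding, so the share is computed against the listed total.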
Negative Logits
MSN       -1.52
ERY       -1.34
icides    -1.34
anwhile   -1.27
tumblr    -1.25
abouts    -1.25
nels      -1.24
conn      -1.22
BUG       -1.21
MX        -1.18
Positive Logits
lique     1.50
��        1.30
ciating   1.27
ample     1.19
VIDIA     1.17
bery      1.16
agre      1.16
bender    1.16
pson      1.11
rall      1.11
Activations Density 0.001%