INDEX
Explanations
pronouns referring to people and their actions
Head Attr Weights
0:0.02
1:0.01
2:0.10
3:0.06
4:0.14
5:0.06
6:0.21
7:0.06
8:0.11
9:0.05
10:0.07
11:0.07
Negative Logits
utenant        -1.58
xtap           -1.54
hern           -1.41
NetMessage     -1.41
Interstitial   -1.37
beware         -1.31
haps           -1.30
mon            -1.26
utor           -1.24
adder          -1.23
Positive Logits
PLA            1.21
Ern            1.21
loved          1.14
Merit          1.13
fronts         1.11
Poké           1.10
debian         1.08
isec           1.08
�              1.07
kus            1.06
Activations Density 0.030%