INDEX
Explanations
instances of specific personal names or identifiers
Head Attr Weights
0:0.03
1:0.02
2:0.08
3:0.26
4:0.03
5:0.03
6:0.11
7:0.09
8:0.05
9:0.09
10:0.06
11:0.08
Negative Logits
tumblr:-1.21
connection:-1.13
ovych:-1.08
umbai:-1.05
️:-1.05
usercontent:-1.05
connection:-1.04
nexus:-1.04
iston:-1.03
aeus:-1.01
Positive Logits
士:1.18
�士:1.17
Lauder:1.16
BCC:1.15
pard:1.06
veto:1.05
infall:1.04
Rai:1.04
KT:1.03
RELEASE:1.01
Activations Density 0.006%