INDEX
Explanations
phrases containing the possessive pronoun "my" followed by numbers
Neuron Alignment
Index   Value   % of L₁
1741    +0.20   0.7%
1978    +0.14   0.5%
1984    +0.13   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1937    +0.20      0.06
1622    +0.14      0.04
1256    +0.13      0.05
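The two columns above (Pearson correlation and cosine similarity) are related but not interchangeable: Pearson correlation is the cosine similarity of the two vectors after each has had its mean subtracted. The sketch below illustrates that relationship in plain Python. The function names and the sample vectors are hypothetical, and the source does not state which vectors the dashboard compares in each column, so this should be read as an illustration of the two metrics rather than a reproduction of the dashboard's computation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation, computed as cosine similarity of mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Hypothetical sample data (not taken from the dashboard): a sparse feature
# activation and a dense neuron activation over the same five tokens.
feature_acts = [0.0, 0.0, 1.0, 0.0, 2.0]
neuron_acts = [0.5, 0.4, 0.9, 0.5, 1.3]

print("cosine :", round(cosine_similarity(feature_acts, neuron_acts), 3))
print("pearson:", round(pearson_correlation(feature_acts, neuron_acts), 3))
```

Because centering changes the vectors, the two numbers can differ substantially for the same pair, which is one reason a table may show a noticeably larger value in one column than in the other.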
Negative Logits
Token      Logit
milf       -1.26
peppa      -1.19
effe       -1.09
madonna    -1.09
fuf        -1.08
excru      -1.08
inappro    -1.07
hentai     -1.06
erad       -1.06
jojo       -1.04
Positive Logits
Token      Logit
my         1.13
<bos>      1.11
my         1.02
My         0.92
myself     0.88
My         0.86
MY         0.81
own        0.81
minha      0.77
MY         0.77
Activations Density 0.159%