INDEX
Explanations
website URLs and promotions related to pets
Neuron Alignment
Index   Value   % of L₁
1984    +0.15   0.5%
1741    +0.15   0.5%
1385    +0.14   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1984    +0.15      0.07
2004    +0.15      0.06
1937    +0.14      0.06
Negative Logits
milf         -0.92
madonna      -0.85
hentai       -0.84
peppa        -0.82
pegasus      -0.79
jojo         -0.78
malheureux   -0.78
pazi         -0.78
embra        -0.78
fuf          -0.77
Positive Logits
your       1.07
your       1.02
Your       0.99
Your       0.98
<bos>      0.96
YOUR       0.95
YOUR       0.91
yourself   0.86
own        0.82
votre      0.80
Activations Density 0.167%