INDEX
Explanations
phrases related to a specific controversial viewpoint on having children
Neuron Alignment
Index   Value   % of L₁
674     +0.09   0.3%
836     +0.07   0.2%
2008    +0.07   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
836     +0.09      0.02
709     +0.07      0.02
1207    +0.07      0.02
Negative Logits
Gorb          -0.70
Rine          -0.57
Hecht         -0.52
kasa          -0.52
Knud          -0.51
Barbier       -0.51
$-$\\         -0.50
Hildebrand    -0.50
↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵    -0.50
Schenk        -0.49
Positive Logits
paradiso        0.63
further         0.56
RenderAtEndOf   0.54
compleanno      0.52
furt            0.51
FURTHER         0.51
leggings        0.51
bacio           0.51
divertimento    0.49
step            0.49
Activations Density 0.165%