Explanations
mentions of relationships with daughters, specifically interactions and experiences with daughters
Neuron Alignment
Index   Value   % of L₁
1937    +0.16   0.5%
871     +0.13   0.4%
1872    +0.11   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1937    +0.16      0.03
1872    +0.13      0.02
401     +0.11      0.02
Negative Logits
trovo       -0.60
brille      -0.52
vanta       -0.50
trovi       -0.49
manca       -0.48
tende       -0.47
bó          -0.47
scopri      -0.47
interessa   -0.46
bisogna     -0.46
Positive Logits
daughter    1.20
daughters   1.14
Daughter    1.13
daughter    1.10
Daughter    1.09
Daughters   0.95
daughters   0.92
McLaugh     0.79
UGHTER      0.78
aughter     0.75
Activations Density 0.047%