INDEX
Explanations
phrases related to family roles and relationships
Neuron Alignment
Index   Value   % of L₁
50      +0.22   0.8%
1081    +0.09   0.3%
1872    +0.08   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1081    +0.22      0.07
1060    +0.09      0.06
933     +0.08      0.05
Negative Logits
<bos>   -2.25
ⓧ       -0.76
/***    -0.70
///**   -0.62
/**     -0.61
/*      -0.57
<!--    -0.56
<?      -0.56
//}     -0.55
//};    -0.54
Positive Logits
stockholm   1.30
Juf         1.26
maneu       1.25
véhic       1.25
strick      1.22
affor       1.20
madonna     1.18
Momb        1.16
riviera     1.16
Keny        1.15
Activations Density 0.642%