INDEX
Explanations
information about a person's background and life events
Neuron Alignment
Index   Value   % of L₁
50      +0.36   1.7%
752     +0.14   0.7%
1177    +0.09   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
752     +0.36      0.15
605     +0.14      0.07
897     +0.09      0.12
Negative Logits
<bos>             -2.57
ⓧ                 -0.97
<?                -0.83
/**               -0.76
(no token text)   -0.73
continue          -0.71
/*                -0.70
got               -0.67
put               -0.67
strive            -0.67
Positive Logits
soulign    1.89
Juf        1.82
vété       1.75
véhic      1.73
Keny       1.65
écout      1.64
délib      1.62
accla      1.57
clô        1.57
considér   1.55
Activations Density 2.350%