INDEX
Explanations
relationships and interactions between different individuals
Neuron Alignment
Index   Value   % of L₁
1150    +0.16   0.5%
2034    +0.16   0.5%
1535    +0.15   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
382     +0.16      0.10
1200    +0.16      0.07
1535    +0.15      0.07
Negative Logits
mef        -1.45
dises      -1.41
haup       -1.33
umo        -1.33
gonz       -1.33
seiz       -1.33
fordable   -1.31
canel      -1.30
hcm        -1.28
abnorm     -1.28
Positive Logits
He          0.91
She         0.83
They        0.82
Thus        0.78
His         0.77
So          0.76
Therefore   0.75
<eos>       0.75
↵↵          0.74
But         0.74
Activations Density: 0.500%