INDEX
Explanations
historical and biographical information
Neuron Alignment
Index   Value   % of L₁
198     +0.10   0.3%
581     +0.08   0.2%
286     +0.08   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
46      +0.10      0.07
1876    +0.08      0.04
1948    +0.08      0.06
Negative Logits
__':          -0.69
Tikang        -0.63
__":          -0.56
└──           -0.55
tía           -0.53
Dichter       -0.53
dovrebbero    -0.52
puder         -0.51
Manbalar      -0.51
|};           -0.51
POSITIVE LOGITS
such          1.07
including     1.02
such          0.97
hairc         0.96
cushi         0.91
ecru          0.88
including     0.85
namely        0.84
Including     0.84
Namely        0.81
Activations Density 0.802%