Explanations
references to scientific articles or research citations
Neuron Alignment
Index   Value   % of L₁
111     +0.13   0.7%
310     +0.11   0.6%
148     +0.11   0.6%
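
The "% of L₁" column is not defined on this page. A minimal sketch of one plausible reading, assuming it is each entry's absolute value divided by the L₁ norm of the feature's direction vector; the function name `neuron_alignment` and the toy vector are hypothetical and not taken from the dashboard:

```python
def neuron_alignment(direction, k=3):
    """Return (index, value, share of L1 norm) for the k largest-magnitude entries.

    Assumption: "% of L1" means |value| / sum(|all values|).
    """
    l1 = sum(abs(v) for v in direction)
    if l1 == 0:
        return []  # an all-zero vector has no meaningful shares
    # Rank indices by absolute value, largest first (ties keep original order).
    ranked = sorted(range(len(direction)),
                    key=lambda i: abs(direction[i]),
                    reverse=True)
    return [(i, direction[i], abs(direction[i]) / l1) for i in ranked[:k]]


# Toy direction vector, for illustration only.
for index, value, share in neuron_alignment([0.5, -2.0, 1.5], k=2):
    print(f"{index}\t{value:+.2f}\t{share:.1%}")
```

Under this reading, small shares like the ones in the table would mean the feature's direction is spread across many neurons and is not concentrated on a few.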
Correlated Neurons
Index   P. Corr.   Cos Sim.
111     +0.13      0.01
469     +0.11      0.00
339     +0.11      0.00
Negative Logits
corresponding   -1.63
computed        -1.61
cross           -1.55
izer            -1.54
predicted       -1.53
percentage      -1.50
uptake          -1.47
converted       -1.47
value           -1.46
equivalent      -1.45
Positive Logits
»¿      2.75
ĻĤ      2.17
Īĺ      2.15
¥       2.03
¤       2.00
į       1.96
Ĩ       1.95
ĩ       1.84
rums    1.84
¬       1.81
Activations Density 0.090%