Explanations
references to hyperlinks
Neuron Alignment
Index   Value   % of L₁
256     +0.17   1.0%
260     +0.14   0.8%
472     +0.13   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
260     +0.17      0.03
256     +0.14      0.03
329     +0.13      0.02
Negative Logits
Token                             Logit
Ļ                                 -2.21
²                                 -2.03
ĥ½                                -1.87
¨                                 -1.83
»¿                                -1.83
ľĵ                                -1.80
ÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤÃĥÃĤ  -1.73
§                                 -1.71
<|outofrange|>                    -1.66
                                  -1.66
Positive Logits
Token    Logit
age      2.21
ages     1.98
up       1.98
points   1.86
marks    1.83
ings     1.80
back     1.79
hammer   1.77
aments   1.77
point    1.75
Activations Density 0.104%