INDEX
Explanations
references to continental concepts or settings
Neuron Alignment
Index   Value   % of L₁
362     +0.17   1.0%
156     +0.14   0.8%
444     +0.14   0.8%
Correlated Neurons
Index   P. Corr.   Cos Sim.
362     +0.17      0.01
429     +0.14      0.01
180     +0.14      0.01
Negative Logits
ellow     -1.55
ted       -1.50
tear      -1.47
inous     -1.46
nearly    -1.44
promise   -1.41
?>        -1.37
tons      -1.36
-)        -1.34
oped      -1.33
Positive Logits
Īĺ    2.74
©     2.73
µ     2.63
¡     2.61
´     2.60
»¿    2.60
ł     2.50
·¸    2.42
£     2.37
»     2.37
Activations Density: 0.027%