Explanations
file extensions or wildcard patterns
Neuron Alignment
Index    Value    % of L₁
23       +0.25    1.4%
478      +0.16    0.9%
156      +0.14    0.8%
Correlated Neurons
Index    P. Corr.    Cos Sim.
504      +0.25       0.01
258      +0.16       0.01
510      +0.14       0.01
Negative Logits
·           -1.98
«           -1.79
Äį          -1.75
Ļ           -1.74
boats       -1.70
cad         -1.65
=”          -1.62
Ľ           -1.59
µ           -1.56
analyzer    -1.56
Positive Logits
rapeutic      1.75
selves        1.65
sudden        1.56
tgz           1.52
new           1.49
leans         1.47
perate        1.46
comes         1.46
flesh         1.44
Protestant    1.44
Activations Density 0.006%