INDEX
Explanations
terms related to classifications and regulations in various contexts
Neuron Alignment
Index   Value   % of L₁
255     +0.15   0.8%
369     +0.15   0.8%
294     +0.14   0.8%
Correlated Neurons
Index   P. Corr.   Cos Sim.
198     +0.15      0.06
255     +0.15      0.06
135     +0.14      0.06
Negative Logits
omy        -1.81
fulness    -1.78
ftware     -1.73
ptions     -1.68
rele       -1.52
abil       -1.51
culus      -1.46
acchar     -1.45
vereign    -1.43
apine      -1.41
Positive Logits
ĻĤ                     4.25
²                      4.18
¬                      4.00
↵↵                     3.96
↵                      3.96
↵                      3.96
<|outofrange|>         3.96
(token not rendered)   3.96
(token not rendered)   3.96
(token not rendered)   3.96
Activations Density: 0.810%