INDEX
Explanations
references to specific names or terms
Neuron Alignment
Index   Value   % of L₁
549     +0.11   0.3%
1177    +0.10   0.3%
321     +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
981     +0.11      0.11
549     +0.10      0.10
577     +0.09      0.08
Negative Logits
Token              Logit
jgl                -0.55
뉘                 -0.53
ẵng                -0.53
Πε                 -0.50
ukone              -0.50
RTCK               -0.50
ElementException   -0.49
ScopeManager       -0.49
farmacia           -0.49
المراجع            -0.48
Positive Logits
Token        Logit
increa       1.36
fuf          1.36
intersper    1.31
encomp       1.29
fto          1.29
?...         1.28
affor        1.28
shenan       1.27
fortn        1.27
strick       1.27
Activations Density: 0.639%