INDEX
Explanations
complex or technical terms
Neuron Alignment
Index   Value   % of L₁
597     +0.16   0.6%
1896    +0.11   0.4%
82      +0.10   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
597     +0.16      0.03
124     +0.11      0.02
321     +0.10      0.03
Negative Logits
Siga          -0.55
Desen         -0.52
Filosof       -0.49
Significado   -0.49
Obra          -0.48
Quais         -0.46
História      -0.45
Nesta         -0.45
dataclass     -0.45
tslib         -0.45
Positive Logits
exp           1.06
EXP           1.00
Exp           0.99
exp           0.94
Exp           0.92
EXP           0.88
expon         0.85
xp            0.83
exponents     0.81
popoli        0.81
Activations Density 0.138%