INDEX
Explanations
technical terms, such as specific software or web addresses
Neuron Alignment
Index    Value    % of L₁
776      +0.13    0.5%
25       +0.13    0.4%
1145     +0.12    0.4%
Correlated Neurons
Index    P. Corr.    Cos Sim.
25       +0.13       0.04
1339     +0.13       0.03
1145     +0.12       0.03
Negative Logits
mbggenerated    -0.58
żdy             -0.48
vrijwilli       -0.48
lccn            -0.44
mediate         -0.42
josh            -0.40
izophren        -0.39
décrire         -0.39
traducciones    -0.39
Dung            -0.39
Positive Logits
special     1.17
Special     1.11
special     1.09
Special     1.07
SPECIAL     1.03
SPECIAL     0.99
specials    0.92
Especial    0.90
pecial      0.86
speciale    0.84
Activations Density 0.075%