INDEX
Explanations
positive descriptions or evaluations
Neuron Alignment
Index   Value   % of L₁
122     +0.16   0.6%
553     +0.13   0.5%
1416    +0.12   0.5%
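The "% of L₁" column is not defined on this page. One plausible reading, stated here as an assumption, is that it gives each neuron's share of the L₁ norm of the feature's weight vector. A minimal sketch under that assumption, using a made-up toy vector rather than the real weights (the name `l1_shares` is hypothetical):

```python
def l1_shares(weights):
    """Return each entry's share of the vector's L1 norm, in percent.

    Assumption: "% of L1" means |w_i| / sum_j |w_j|. The dashboard does
    not confirm this definition.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; shares are undefined")
    return [100.0 * abs(w) / l1_norm for w in weights]


# Toy vector for illustration only; these are not the feature's weights.
toy_weights = [0.5, -0.25, 0.25]
print(l1_shares(toy_weights))
```

Under this reading, shares of about 0.5% to 0.6% for the top three neurons would mean the weight is spread thinly across many neurons rather than concentrated in a few.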
Correlated Neurons
Index   P. Corr.   Cos Sim.
122     +0.16      0.03
197     +0.13      0.02
1512    +0.12      0.02
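"P. Corr." presumably stands for the Pearson correlation and "Cos Sim." for cosine similarity; which pairs of vectors the dashboard compares is not stated here. As a rough sketch of how the two measures differ, assuming plain Python lists as inputs (both function names are illustrative):

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity after mean-centering."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine_similarity([x - mean_a for x in a],
                             [y - mean_b for y in b])


# Toy data: b is a shifted copy of a, so the correlation is 1 while
# the cosine similarity is below 1.
a = [1.0, 2.0, 3.0]
b = [11.0, 12.0, 13.0]
print(pearson_correlation(a, b), cosine_similarity(a, b))
```

Because Pearson correlation subtracts the means first, the two columns can diverge, as they do in the table.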
Negative Logits
purcha      -0.95
affor       -0.94
depic       -0.94
increa      -0.92
indestru    -0.91
apprehen    -0.90
unden       -0.89
beaute      -0.86
encomp      -0.85
paula       -0.84
Positive Logits
bright        1.35
bright        1.30
Bright        1.25
Bright        1.19
BRIGHT        1.13
brighter      1.13
brightness    1.08
brightest     1.08
BRIGHT        1.04
brighten      0.90
Activations Density 0.068%
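Activation density is commonly taken to be the fraction of tokens on which the feature fires; at 0.068% that would be roughly 68 tokens per 100,000. A small sketch of that calculation, assuming "fires" means an activation above a threshold of zero (the function name and the threshold are assumptions, not the dashboard's stated method):

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`."""
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy activations for illustration: one active token out of four.
toy_activations = [0.0, 0.0, 0.0, 1.2]
print(f"{activation_density(toy_activations):.3%}")
```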