Explanations
adjectives and descriptions related to size, scale, and extremes
Neuron Alignment
Index    Value    % of L₁
297      +0.10    0.3%
1385     +0.09    0.3%
1042     +0.09    0.3%
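The Value and % of L₁ columns can be read as follows, under an assumption the page itself does not state: Value is taken to be the feature direction's weight on a given neuron, and % of L₁ is that weight's absolute value divided by the L₁ norm of the whole direction. A minimal sketch of that reading, using a hypothetical helper `top_aligned_neurons` and a toy vector rather than the real feature:

```python
def top_aligned_neurons(direction, k=3):
    """Return (index, value, share_of_l1) for the k largest-magnitude entries.

    Assumption: `direction` is the feature's weight vector over neurons.
    The name and interface are illustrative, not taken from the source page.
    """
    l1 = sum(abs(v) for v in direction)
    if l1 == 0:
        return []
    # Rank neuron indices by absolute weight, largest first.
    ranked = sorted(range(len(direction)),
                    key=lambda i: abs(direction[i]),
                    reverse=True)
    return [(i, direction[i], abs(direction[i]) / l1) for i in ranked[:k]]


# Toy direction with four neurons; its L1 norm is exactly 1.0.
toy = [0.5, -0.25, 0.125, 0.125]
rows = top_aligned_neurons(toy, k=2)
for idx, value, share in rows:
    print(f"{idx}\t{value:+.2f}\t{share:.1%}")
```

Under this reading, a top weight of +0.10 with a 0.3% share would imply an L₁ norm of roughly 33, which suggests the direction is spread thinly across many neurons rather than concentrated in a few.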
Correlated Neurons
Index    P. Corr.    Cos Sim.
736      +0.10       0.07
908      +0.09       0.02
870      +0.09       0.04
Negative Logits
kram      -0.66
silang    -0.64
permu     -0.64
asfal     -0.59
raso      -0.59
ilang     -0.58
perif     -0.57
sarili    -0.56
dita      -0.56
kras      -0.55
Positive Logits
size      1.04
bigger    0.96
larger    0.96
sizes     0.94
size      0.94
Size      0.92
Size      0.90
Bigger    0.89
sized     0.89
Larger    0.86
Activations Density 0.779%
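The density figure is most naturally read as the fraction of sampled tokens on which the feature has a nonzero activation, though the page does not define it. A minimal sketch under that assumption, with a hypothetical helper `activation_density` and made-up activations:

```python
def activation_density(activations, threshold=0.0):
    """Fraction of entries strictly above `threshold`.

    Assumption: `activations` holds one feature activation per sampled token.
    The helper name and the threshold default are illustrative only.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy sample: 1000 tokens, of which 3 activate the feature.
toy_activations = [0.0] * 996 + [0.4, 1.2, 0.0, 2.5]
density = activation_density(toy_activations)
print(f"{density:.3%}")
```

On that reading, a density of 0.779% would mean the feature fires on fewer than one token in a hundred, which is consistent with a narrow, size-related feature.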