INDEX
Explanations
strong, impactful, or vivid sensory descriptions and imagery
Neuron Alignment
Index   Value   % of L₁
1385    +0.14   0.5%
1325    +0.11   0.4%
25      +0.11   0.3%
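The "% of L₁" column can be read as each neuron's share of the L₁ norm of the feature's weight vector. The sketch below illustrates that reading under an assumption: the dashboard does not state how the column is computed, and the function name `l1_shares` and the toy vector are hypothetical, not data from this feature.

```python
def l1_shares(weights, top_k=3):
    """Return (index, value, percent_of_l1) for the top_k largest-magnitude weights.

    Assumption: "% of L1" means |w_i| divided by the sum of |w_j| over all j.
    """
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        return []  # an all-zero vector has no meaningful shares
    # Rank indices by absolute weight, largest first.
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    return [(i, weights[i], 100.0 * abs(weights[i]) / l1) for i in ranked[:top_k]]


# Hypothetical toy vector whose L1 norm is exactly 1.0.
toy = [0.5, -0.25, 0.125, 0.125]
rows = l1_shares(toy, top_k=2)
for index, value, pct in rows:
    print(f"{index}\t{value:+.2f}\t{pct:.1f}%")
```

Under this reading, small percentages such as those in the table would mean the feature's weight is spread across many neurons rather than concentrated in a few.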
Correlated Neurons
Index   P. Corr.   Cos Sim.
1805    +0.14      0.04
25      +0.11      0.04
624     +0.11      0.03
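"P. Corr." and "Cos Sim." are presumably the Pearson correlation and the cosine similarity. The sketch below shows the standard definitions of both in plain Python. Which vectors the dashboard actually compares (activations over a token sample, weight directions, or something else) is not stated in the source, so the inputs here are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # Raises ZeroDivisionError if either sequence is constant.
    return cov / math.sqrt(var_x * var_y)


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Raises ZeroDivisionError if either vector is all zeros.
    return dot / (norm_u * norm_v)


# Hypothetical inputs, not taken from this feature.
print(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
print(cosine([1.0, 0.0], [0.0, 1.0]))
```

The two measures differ in that Pearson correlation subtracts the mean first, so a pair can show a modest correlation while its cosine similarity stays near zero, as in the rows above.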
Negative Logits
OGND        -0.71
Ảnh         -0.70
Được        -0.64
Tại         -0.63
Phân        -0.63
[-\         -0.62
Đây         -0.62
Điện        -0.61
AllowUser   -0.60
Điều        -0.59
Positive Logits
fta         1.60
Augu        1.50
ftu         1.48
thut        1.40
fup         1.38
fto         1.36
»>          1.33
feen        1.33
fays        1.32
miu         1.29
Activations Density 0.265%