INDEX
Explanations
names and titles related to academia
Neuron Alignment
Index    Value    % of L₁
168      +0.14    0.8%
1942     +0.13    0.8%
1691     +0.12    0.7%
Correlated Neurons
Index    P. Corr.    Cos Sim.
168      +0.14       0.04
1942     +0.13       0.03
1691     +0.12       0.03
Negative Logits
Token            Logit
<bos>            -2.23
Đặc              -0.70
дописавши        -0.69
Ngoài            -0.64
sự               -0.62
HideFlags        -0.62
autorytatywna    -0.61
Màu              -0.61
HasColumnType    -0.59
mặt              -0.59
Positive Logits
Token      Logit
Ph         1.58
Ph         1.40
ph         1.39
Phas       1.15
ph         1.10
philips    1.10
phat       1.08
PH         1.05
Meksi      1.04
tph        1.03
Activations Density 0.185%