INDEX
Explanations
information about academic and research institutions, policies, and events
Neuron Alignment
Index   Value   % of L₁
752     +0.13   0.4%
678     +0.13   0.4%
1150    +0.12   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
16      +0.13      0.06
678     +0.13      0.05
474     +0.12      0.04
Negative Logits
gusted      -0.63
tazas       -0.61
<bos>       -0.59
extField    -0.55
參考文獻    -0.53
lccccc      -0.53
gruntled    -0.52
astrous     -0.52
׃           -0.51
])))        -0.51
Positive Logits
encomp      1.16
intersper   1.14
shenan      1.09
uninten     1.06
impra       1.05
vagu        1.04
unspeak     1.02
indescri    1.02
Intere      1.02
reluct      1.02
Activations Density 0.296%