INDEX
Explanations
phrases related to tracking and monitoring
Neuron Alignment
Index   Value   % of L₁
50      +0.23   1.5%
687     +0.16   1.0%
479     +0.13   0.8%
Correlated Neurons
Index   P. Corr.   Cos Sim.
687     +0.23      0.03
1464    +0.16      0.03
479     +0.13      0.03
Negative Logits
<bos>             -2.81
else              -0.73
ൊ                 -0.71
AssemblyCompany   -0.68
//                -0.67
հղումներ          -0.65
ComponentModel    -0.64
                  -0.63
<thead>           -0.63
util              -0.62
Positive Logits
increa    2.07
accla     2.04
affor     2.01
maneu     1.93
impra     1.92
strick    1.87
reluct    1.87
disagre   1.84
unspeak   1.84
inev      1.83
Activations Density 0.046%