INDEX
Explanations
references to language proficiency and translation
Neuron Alignment
Index   Value   % of L₁
50      +0.32   1.1%
198     +0.11   0.4%
1150    +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
198     +0.32      0.04
1929    +0.11      0.04
1714    +0.09      0.03
Negative Logits
<bos>         -1.78
DockStyle     -0.71
+#+#          -0.66
isContained   -0.60
mapreduce     -0.59
íí            -0.58
bewerken      -0.57
setDo         -0.57
GORITH        -0.57
RuleContext   -0.57
Positive Logits
affor       1.53
véhic       1.47
maneu       1.43
increa      1.40
Juf         1.35
reluct      1.34
accla       1.32
beverly     1.32
chrysler    1.32
carrefour   1.30
Activations Density 0.298%