INDEX
Explanations
terms related to military operations or historical events
Neuron Alignment
Index    Value    % of L₁
394      +0.27    1.0%
137      +0.14    0.5%
1177     +0.13    0.5%
Correlated Neurons
Index    Pearson Corr.    Cos. Sim.
137      +0.27            0.12
394      +0.14            0.08
1784     +0.13            0.09
Negative Logits
Token               Value
<bos>               -0.97
featureID           -0.76
MessageTagHelper    -0.75
getWriter           -0.74
AndEndTag           -0.73
Попис               -0.73
Normdatei           -0.73
ValueStyle          -0.72
Wiktionnaire        -0.71
fvar                -0.68
Positive Logits
Token               Value
reluct              2.10
increa              2.03
disagre             1.95
shenan              1.94
emphat              1.91
maneu               1.90
inev                1.89
encomp              1.89
affor               1.84
wherea              1.82
Activations Density: 2.366%