INDEX
Explanations
mentions of jihadist groups and their activities
Neuron Alignment
Index   Value   % of L₁
50      +0.24   0.9%
1842    +0.13   0.5%
198     +0.10   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
446     +0.24      0.04
1842    +0.13      0.03
198     +0.10      0.05
Negative Logits
<bos>          -3.41
BeginContext   -0.69
createSprite   -0.68
IsMutable      -0.68
public         -0.68
ProtoMessage   -0.64
ivelany        -0.64
MockBean       -0.61
JspWriter      -0.60
rungsseite     -0.60
Positive Logits
affor       1.77
unlaw       1.68
increa      1.62
unwarran    1.61
beaute      1.61
reluct      1.60
unspeak     1.60
tolerably   1.59
gaily       1.57
impra       1.52
Activations Density: 0.316%