Explanations
information related to violent or tragic events
Neuron Alignment
Index   Value   % of L₁
50      +0.22   0.9%
2019    +0.11   0.4%
1150    +0.08   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
2019    +0.22      0.09
381     +0.11      0.03
1150    +0.08      0.03
Negative Logits
<bos>               -1.97
ⓧ                   -0.99
/***                -0.84
(no visible token)  -0.82
/**                 -0.73
///**               -0.72
/*                  -0.70
<?                  -0.63
chtenstein          -0.60
<!--                -0.60
Positive Logits
soulign   1.33
accla     1.29
Keny      1.28
affor     1.24
Juf       1.23
maneu     1.23
véhic     1.22
impra     1.22
increa    1.21
Khart     1.20
Activations Density: 0.749%