Explanations
words related to prevention and preventive actions
Neuron Alignment
Index    Value    % of L₁
50       +0.26    1.5%
130      +0.11    0.6%
492      +0.09    0.5%
Correlated Neurons
Index    P. Corr.    Cos Sim.
130      +0.26       0.03
100      +0.11       0.02
1233     +0.09       0.02
Negative Logits
Token               Logit
<bos>               -2.88
(blank in source)   -0.79
<?                  -0.76
ⓧ                   -0.73
<?                  -0.66
/**                 -0.59
class               -0.59
HasIndex            -0.58
/***                -0.58
PrintStream         -0.57
Positive Logits
Token      Logit
Minang     1.41
Juf        1.38
jaya       1.37
saar       1.35
lele       1.33
Banjar     1.31
bandung    1.30
hcm        1.30
kyo        1.29
aen        1.26
Activation Density: 0.064%