INDEX
Explanations
appreciation and positive comments
Neuron Alignment
Index   Value   % of L₁
50      +0.19   1.0%
1065    +0.14   0.8%
1334    +0.14   0.7%
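The "% of L₁" column in the Neuron Alignment table presumably reports each neuron's value as a share of the L₁ norm of the full alignment vector. That reading is an assumption, since the page does not define the column. A minimal Python sketch under that assumption, using a hypothetical helper `l1_share` and a toy vector that is not taken from the table:

```python
def l1_share(values):
    """Return each entry's absolute value as a percentage of the vector's L1 norm.

    Assumption: "% of L1" means 100 * |v_i| / sum_j |v_j|.
    The dashboard does not state its definition.
    """
    l1 = sum(abs(v) for v in values)
    if l1 == 0:
        # An all-zero vector has no meaningful shares; report zeros.
        return [0.0 for _ in values]
    return [100.0 * abs(v) / l1 for v in values]


# Hypothetical toy vector, chosen only to make the arithmetic easy to follow.
toy = [0.5, -0.25, 0.25]
shares = l1_share(toy)
print(shares)
```

Under this reading, a value of +0.19 listed at 1.0% would imply an L₁ norm of roughly 19 for the full vector, though the rounded figures only support a rough estimate.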
Correlated Neurons
Index   P. Corr.   Cos Sim.
1065    +0.19      0.04
1334    +0.14      0.04
971     +0.14      0.03
Negative Logits
<bos>     -2.99
ⓧ         -1.03
/***      -1.00
/*!       -0.83
(blank)   -0.83
<?        -0.81
endow     -0.73
/**       -0.71
///**     -0.71
/**       -0.68
Positive Logits
lele         0.94
Minang       0.91
saar         0.90
Definitely   0.88
Definitely   0.87
territo      0.87
bandung      0.84
thuy         0.83
tawan        0.83
ados         0.82
Activations Density 0.140%