INDEX
Explanations
instances of the word "example" followed by a number, optionally followed by a comma and a second number
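The explanation above describes a simple textual pattern, which can be approximated with a regular expression. This is only a sketch of that pattern, not the feature's actual behavior: the name `find_example_refs`, the case-insensitive matching, and the tolerance for whitespace around the comma are all assumptions made for illustration.

```python
import re

# Assumed pattern: the word "example", whitespace, a number,
# then optionally a comma and a second number.
EXAMPLE_REF = re.compile(r"\bexample\s+(\d+)(?:\s*,\s*(\d+))?", re.IGNORECASE)


def find_example_refs(text):
    """Return (matched_text, first_number, second_number_or_None) per hit."""
    hits = []
    for m in EXAMPLE_REF.finditer(text):
        second = int(m.group(2)) if m.group(2) else None
        hits.append((m.group(0), int(m.group(1)), second))
    return hits
```

Calling `find_example_refs` on a passage returns one tuple per occurrence, which can be compared by eye against the contexts where the feature fires most strongly.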
Neuron Alignment
Index   Value   % of L₁
50      +0.24   1.1%
1127    +0.10   0.5%
1637    +0.09   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1637    +0.24      0.03
1127    +0.10      0.03
361     +0.09      0.03
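The two columns above can be read as standard similarity measures, assuming "P. Corr." abbreviates Pearson correlation and "Cos Sim." abbreviates cosine similarity. The page does not say which vectors are compared, so the sketch below only shows the two formulas on plain Python lists; the function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

Pearson correlation subtracts the mean from each sequence before comparing and cosine similarity does not, so the two measures can differ substantially on the same pair of vectors.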
Negative Logits
<bos>    -3.00
/***     -0.99
ⓧ        -0.89
///**    -0.78
#![      -0.72
<?       -0.71
/*!      -0.69
<?       -0.68
//});    -0.67
Более    -0.63
Positive Logits
quoc       1.12
ftu        1.11
maneu      1.11
kyo        1.09
umo        1.06
fta        1.06
bandung    1.06
aveug      1.05
guarante   1.04
véhic      1.04
Activations Density: 0.121%
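A density figure like the one above is commonly the fraction of sampled tokens on which a feature is active. The sketch below assumes that definition and a firing threshold of zero; neither is stated on this page, and `activation_density` is a hypothetical helper.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of positions whose activation exceeds `threshold` (assumed definition)."""
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)
```

Under this reading, a density of 0.121% would correspond to roughly 12 active tokens per 10,000 sampled.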