INDEX
Explanations
references to personal experiences and information
Neuron Alignment
Index   Value   % of L₁
50      +0.26   1.8%
1741    +0.12   0.9%
1984    +0.10   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1984    +0.26      0.09
144     +0.12      0.07
1187    +0.10      0.07
Negative Logits
<bos>    -2.72
/**      -0.87
/***     -0.79
ⓧ        -0.76
/*       -0.75
<!--     -0.73
<?       -0.73
///**    -0.72
         -0.70
<?       -0.63
Positive Logits
quoique      0.97
véhic        0.96
compréhen    0.94
soulign      0.92
getMy        0.90
montrant     0.90
désol        0.88
incroy       0.88
considér     0.85
déliv        0.84
Activations Density: 0.237%