Explanations
expressions related to emotions and relationships, particularly love and gratitude
Neuron Alignment
Index   Value   % of L₁
303     +0.07   0.2%
1100    +0.06   0.2%
82      +0.06   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
204     +0.07      0.02
1385    +0.06      0.03
1137    +0.06      0.03
Negative Logits
Token             Logit
ⓧ                 -0.97
<bos>             -0.86
LEncoder          -0.71
<?                -0.70
memoized          -0.70
EndContext        -0.68
TargetException   -0.68
Về                -0.67
execSQL           -0.67
</thead>          -0.66
Positive Logits
Token       Logit
affor       1.73
impra       1.63
quoique     1.62
increa      1.58
maneu       1.58
wherea      1.56
strick      1.56
guarante    1.52
Juf         1.49
shenan      1.45
Activations Density: 0.257%