Explanations
requests for assistance or guidance in achieving specific tasks
Negative Logits
Token         Logit
:✨            -0.80
<unused68>    -0.77
<unused14>    -0.77
<unused74>    -0.76
<unused41>    -0.76
<unused8>     -0.76
<unused3>     -0.76
<unused16>    -0.76
[@BOS@]       -0.76
<pad>         -0.76
Positive Logits
Token         Logit
is            0.57
includes      0.28
Allerdings    0.28
yakni         0.28
namely        0.28
yaitu         0.28
are           0.28
involves      0.27
appears       0.27
consists      0.27
Activations Density 0.083%
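
The page does not define how the density figure is computed. A minimal sketch, assuming it is the share of sampled token positions on which the feature fires (activation above zero); the function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and used only for illustration:

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a nonzero activation.
    """
    if len(activations) == 0:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data (not taken from the page): 12,000 token positions, 10 of them active.
acts = [0.0] * 11_990 + [1.5] * 10
density = activation_density(acts)
print(f"{density:.3%}")  # prints "0.083%"
```

Under this reading, a density of 0.083% would mean the feature is active on roughly one token in every 1,200.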