Explanations
list items in a structured format
Negative Logits
utafitiHapana
-0.81
パンチラ
-0.71
-0.71
fashiola
-0.69
<unused14>
-0.69
<unused8>
-0.69
<unused41>
-0.69
<unused51>
-0.69
<pad>
-0.69
<unused1>
-0.69
Positive Logits
:✨
0.55
<eos>
0.45
________________
0.40
↵↵
0.35
ویکیآمباردا
0.32
↵
0.32
<tr>
0.32
intptr
0.32
↵↵↵
0.32
LITERAL
0.30
Activations Density: 0.000%