INDEX
Explanations
phrases indicating a sense of loss or change over time
Negative Logits
(blank)        -0.66
↵              -0.65
↵↵             -0.61
“              -0.60
multiple       -0.58
"              -0.57
alternative    -0.57
本             -0.54
(blank)        -0.54
the            -0.54
Positive Logits
ujednoznacz    1.19
httphttps      1.16
⟬              1.05
<unused41>     1.05
<unused43>     1.05
<unused14>     1.04
<unused17>     1.04
<unused3>      1.04
<pad>          1.04
[@BOS@]        1.04
Activations Density 0.150%