Explanations
punctuation marks and formatting in the text
Negative Logits
Token      Logit
aylight    -0.17
uby        -0.15
antino     -0.14
éry        -0.13
riteln     -0.13
ãĢĤãģĬ     -0.13
γά         -0.12
ekyll      -0.12
Silva      -0.12
OfSize     -0.12
Positive Logits
Token      Logit
``         0.31
``         0.23
So         0.18
``↵        0.18
BT         0.17
Cave       0.17
You        0.17
Æ¡         0.17
Q          0.16
so         0.16
Activations Density 0.072%