INDEX
    Explanations

    punctuation marks or markers often used in written language

    Negative Logits
    }$               -1.13
     Efq             -0.96
     ―――――           -0.94
    NUMX             -0.90
     photolibrary    -0.87
    клопе            -0.87
    ^(@)             -0.87
    esterday         -0.86
    ✭✭               -0.86
    ---*/            -0.85
    Positive Logits
     They            1.07
                     0.93
    "                0.92
    They             0.91
    )                0.90
    ↵↵               0.86
    .                0.84
     I               0.83
     It              0.82
    ),               0.81
    Activation Density: 0.963%

    No Known Activations