INDEX
    Explanations

    specific symbols or characters

    symbols or characters that represent intensity or urgency

    NEGATIVE LOGITS
     Tob       -0.73
     Obi       -0.69
     interf    -0.68
     Bent      -0.67
     transc    -0.67
     phen      -0.66
     Brist     -0.65
     Tek       -0.65
     bes       -0.64
     Sok       -0.64
    POSITIVE LOGITS
    Ļ          1.78
    ¬          1.34
    ª          1.32
    ¡          1.26
    ı          1.24
    ħ          1.24
    į          1.20
    «          1.18
    µ          1.16
    Ń          1.16
    Activation Density: 0.450%

    No Known Activations