INDEX
    Explanations

    mathematical expressions and punctuation

    Negative Logits
    ","","    -1.00
    zhen    -0.98
     primero    -0.96
    pso    -0.94
     molte    -0.93
     dettagli    -0.92
     primeiro    -0.90
    alers    -0.89
    cohol    -0.89
    だけではなく    -0.88

    Positive Logits
     and    0.86
     коли    0.84
     vilja    0.83
    Lastly    0.83
    کتاب    0.82
     роз    0.81
    řské    0.79
     comédie    0.79
     säga    0.78
    ETRIC    0.78

    Activation Density: 0.031%

    No Known Activations