INDEX
    Explanations

    punctuation marks, particularly periods and commas

    Negative Logits
    olt      -0.17
    ucci     -0.16
    год      -0.16
    lett     -0.15
    TRS      -0.14
    esini    -0.14
    ستÛĮ     -0.14
    /npm     -0.14
    487      -0.14
    eling    -0.14
    Positive Logits
    bah      0.16
     Ster    0.15
    ikhail   0.15
    sdale    0.15
    onces    0.15
    abez     0.15
    oba      0.14
    èİ       0.14
    ohana    0.14
    uni      0.14
    Activation Density: 0.004%

    No Known Activations