INDEX
    Explanations

    punctuation and grammatical elements in the text

    Negative Logits
    " Conv"    -0.16
    "иб"       -0.16
    "Conv"     -0.16
    "èĢ"       -0.15
    "idor"     -0.15
    "leyen"    -0.15
    "uvo"      -0.15
    "inan"     -0.15
    "mse"      -0.14
    "çĩ"       -0.14
    Positive Logits
    "imoto"       0.16
    "apol"        0.16
    "ieren"       0.16
    " strictly"   0.15
    "oses"        0.15
    "ognito"      0.15
    "089"         0.14
    "anta"        0.14
    "Strict"      0.14
    "ition"       0.14
    Act Density 0.002%

    No Known Activations