INDEX
    Explanations
    Negative Logits
    Token    Logit
    anik     -0.63
    yek      -0.61
    yak      -0.59
    beek     -0.57
    eek      -0.54
    jek      -0.54
    e        -0.54
    dek      -0.53
    atk      -0.53
    jak      -0.52
    Positive Logits
    Token         Logit
    })*/          0.68
    os            0.68
     يتيمه        0.66
     itſelf       0.66
    ness          0.64
    ets           0.63
    ies           0.60
    🏾            0.59
     Chriftian    0.59
    ers           0.58
    Activation Density: 0.132%

    No Known Activations