INDEX
    Explanations
    NEGATIVE LOGITS
     hefur        -0.10
     periódico    -0.08
     pecado       -0.08
    િર            -0.08
    (sr           -0.08
     snel         -0.08
     sinful       -0.08
    )._           -0.07
     perigos      -0.07
    isie          -0.07
    POSITIVE LOGITS
     idx          0.08
    aret          0.08
    ignore        0.08
     enumerate    0.07
    tip           0.07
    ayon          0.07
    _index        0.07
    wọn           0.07
     பட்ட         0.07
    Index         0.07
    Activation Density: 0.002%

    No Known Activations