    NEGATIVE LOGITS
    Token         Logit
    " evangel"    -0.06
    "ažd"         -0.06
    " елем"       -0.06
    "ậu"          -0.06
    (blank)       -0.06
    (blank)       -0.06
    "gt"          -0.06
    "xd"          -0.06
    " Rick"       -0.06
    "کرد"         -0.06
    POSITIVE LOGITS
    Token         Logit
    " tabel"      0.07
    "venida"      0.06
    "_movies"     0.06
    " Images"     0.06
    "ecera"       0.06
    (blank)       0.06
    " Frontier"   0.05
    " churn"      0.05
    " tweeting"   0.05
    " зміст"      0.05
    Activation Density: 0.851%

    No Known Activations