INDEX
    Explanations

    which where

    Negative Logits
    Token                 Logit
    (no visible token)    -0.08
    "اطر"                 -0.07
    "people"              -0.07
    " domest"             -0.07
    "شنا"                 -0.07
    " ingr"               -0.07
    " inherent"           -0.07
    "empt"                -0.07
    (no visible token)    -0.07
    "Interview"           -0.07
    Positive Logits
    Token                 Logit
    " Versa"              0.08
    " Acting"             0.08
    " Christus"           0.07
    (no visible token)    0.07
    "olare"               0.07
    "Natürlich"           0.07
    "/j"                  0.07
    " acaso"              0.07
    "_views"              0.07
    " Muslim"             0.07
    Act Density 0.001%

    No Known Activations