    Negative Logits
    Token              Logit
    " mura"            -0.08
    " tulad"           -0.08
    " conveniently"    -0.08
    "ifen"             -0.08
    " Door"            -0.07
    " Vas"             -0.07
    " Nas"             -0.07
    " kettle"          -0.07
    " HAD"             -0.07
    "nor"              -0.07
    Positive Logits
    Token              Logit
    "ире"              0.08
    (not shown)        0.08
    " بحث"             0.08
    "mv"               0.08
    " ул"              0.08
    " primordial"      0.08
    " freedoms"        0.08
    " allegations"     0.08
    " pupils"          0.07
    "ादेश"             0.07
    Activation Density: 0.003%

    No Known Activations