INDEX
    Explanations
    Negative Logits
    endor
    -0.07
    پي
    -0.07
    -controller
    -0.07
     aptitude
    -0.07
    ثمان
    -0.06
    ென
    -0.06
    'or
    -0.06
    àm
    -0.06
     ọsọ
    -0.06
     transmitter
    -0.06
    Positive Logits
     bath
    0.09
     Bath
    0.09
     баць
    0.08
    0.08
     baths
    0.08
    ћа
    0.08
    насць
    0.08
     famil
    0.08
    laap
    0.07
     жосп
    0.07
    Act Density 0.004%

    No Known Activations