INDEX
    Negative Logits
    Token               Logit
    " nouvelle"         0.38
    " लेखन"              0.37
    " nouvelles"        0.36
    " تاریخ"             0.36
    " intercultural"    0.36
    " altre"            0.35
    " Šk"               0.35
                        0.34
    " spiritual"        0.34
    " çalışmalar"       0.34
    Positive Logits
    Token               Logit
    " defraud"          0.45
    " avoid"            0.44
    " overpriced"       0.39
    "avoid"             0.39
    " fraudulently"     0.38
    " menghindari"      0.38
    " disinfect"        0.38
    " मुफ्त"              0.38
    "َوْ"                0.37
    " sacrificed"       0.36
    Activation Density: 0.002%

    No Known Activations