INDEX
    Explanations

    problematic and offensive terms

    Negative Logits
    0.51  " каждой"
    0.44  " Each"
    0.40  " EACH"
    0.40  [missing]
    0.39  [missing]
    0.39  " each"
    0.39  "}=(-"
    0.38  "Each"
    0.38  " collectors"
    0.37  "ко"

    Positive Logits
    0.46  "been"
    0.46  " spearheaded"
    0.45  " been"
    0.44  " করেছে"
    0.43  " rebranded"
    0.41  " teknologi"
    0.41  "www"
    0.40  " தற்போது"
    0.40  "နောက်"
    0.39  " Blvd"
    Activation Density: 0.004%

    No Known Activations