INDEX
    Explanations

    addressing groups of people

    Negative Logits
    " was"            1.09
    " নিজেও"          0.95
    " protégé"        0.95
    " fabricant"      0.94
    "ется"            0.94
    " perpetrator"    0.90
    " নিজে"           0.90
    " wasn"           0.89
    " নিজেই"          0.88
    " був"            0.87
    Positive Logits
    "ı"               1.02
    "s"               1.00
    "hips"            0.95
    " którzy"         0.94
    "u"               0.88
    "m"               0.87
    "纷纷"            0.86
    "ேத்க"            0.86
    " جميعا"          0.84
    "c"               0.84
    Activation Density: 0.192%

    No Known Activations