INDEX
    Explanations

    educational and medical institutions

    NEGATIVE LOGITS
     Cas          -0.07
     wall         -0.07
    Man           -0.07
     person       -0.06
     gay          -0.06
    Ma            -0.06
     пти          -0.06
     duck         -0.06
    duck          -0.06
     LO           -0.06
    POSITIVE LOGITS
    肯定            0.07
    orses          0.06
     चक            0.06
    реп            0.06
    infer          0.06
                   0.06
    -conf          0.06
    curity         0.06
     неп           0.06
     meilleurs     0.06
    Act Density 0.052%

    No Known Activations