    Explanations

    references to authority figures and leadership roles

    Negative Logits
    Token        Logit
    actly        -0.14
    uellement    -0.13
    ậm           -0.13
    zelf         -0.13
    -за          -0.13
    asily        -0.13
    atism        -0.12
    conds        -0.12
    icamente     -0.12
    irtual       -0.12
    Positive Logits
    Token           Logit
    liest           0.22
    aviest          0.19
    iest            0.18
    quirer          0.16
    -too            0.14
     creampie       0.14
    osphere         0.14
    DisplayStyle    0.14
    .googleapis     0.14
    niest           0.14
    Act Density 2.302%

    No Known Activations