INDEX
    Explanations

    references to authority figures or leadership roles

    Negative Logits
    hed           -0.18
    éru           -0.16
    adh           -0.16
    ร             -0.15
    ØŃت           -0.14
    andum         -0.14
    795           -0.14
     vast         -0.14
    adf           -0.14
    ãĥ³ãĥĩãĤ£     -0.14
    Positive Logits
    anova         0.24
    (es           0.22
    ial           0.20
    -worker       0.19
    /exec         0.19
    å¨ĺ           0.19
    eldorf        0.19
    iale          0.18
    dom           0.18
    iali          0.17
    Activation Density: 0.028%

    No Known Activations