    Explanations

    references to societal perceptions and critiques of leadership and authority

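    Negative Logits
    Token              Logit
    " Nom"             -0.17
    " Hab"             -0.16
    "廊"               -0.14   (Chinese: "corridor")
    "SystemService"    -0.14
    " Ả"               -0.14
    " 마음"            -0.14   (Korean: "heart, mind")
    "รม"               -0.14
    "Nom"              -0.13
    "ismet"            -0.13
    " nick"            -0.13

    Positive Logits
    Token              Logit
    " importance"      0.20
    "sworth"           0.19
    " significance"    0.18
    "/tab"             0.17
    " import"          0.17
    " Importance"      0.16
    " مقد"             0.16
    " special"         0.16
    " اهمیت"           0.15   (Persian: "importance")
    "重要"             0.15   (Chinese/Japanese: "important")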
    Activation Density: 0.204%

    No Known Activations