INDEX
    Explanations

    references to political ideologies and their implications in society

    Negative Logits
    "į¼"         -0.13
    "/API"       -0.13
    "segue"      -0.13
    "aste"       -0.12
    "渡"         -0.12
    "ABCDEFG"    -0.12
    "ŃIJ"        -0.12
    "omal"       -0.12
    "İĺìĿ´"      -0.12
    "еле"        -0.12
    Positive Logits
    " in"        0.51
    "åľ¨"        0.34
    " în"        0.34
    "à¹ĥà¸Ļ"     0.31
    " ÙģÙĬ"      0.30
    " åľ¨"       0.29
    "InThe"      0.26
    " در"        0.24
    " In"        0.24
    " ÙģÙī"      0.23
    Act Density 0.144%

    No Known Activations
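
Several token strings in the logit lists above (for example åľ¨ or ÙģÙĬ) look like raw byte-level BPE vocabulary entries rather than readable text. The sketch below shows one way to make them legible, assuming they use the GPT-2 style byte-to-unicode table; that is an assumption, since this page does not name the tokenizer. The helper names `bytes_to_unicode` and `decode_token` are illustrative, not part of any dashboard API.

```python
def bytes_to_unicode():
    """Build the GPT-2 style table mapping each byte 0..255 to a printable character."""
    printable = (list(range(ord("!"), ord("~") + 1))
                 + list(range(ord("¡"), ord("¬") + 1))
                 + list(range(ord("®"), ord("ÿ") + 1)))
    byte_values = printable[:]
    code_points = printable[:]
    n = 0
    for b in range(256):
        if b not in printable:
            # Non-printable bytes are shifted to code points starting at 256.
            byte_values.append(b)
            code_points.append(256 + n)
            n += 1
    return dict(zip(byte_values, map(chr, code_points)))


# Inverse table: stand-in character -> original byte value.
BYTE_DECODER = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Decode a byte-level token string; return it unchanged if it cannot be decoded.

    Assumption: literal spaces in the dump stand for the space byte 0x20.
    """
    if any(ch != " " and ch not in BYTE_DECODER for ch in token):
        return token  # contains characters outside the table, so already readable
    raw = bytes(0x20 if ch == " " else BYTE_DECODER[ch] for ch in token)
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return token  # partial byte sequence or already-decoded Latin-1 text


if __name__ == "__main__":
    for tok in ["åľ¨", "à¹ĥà¸Ļ", " ÙģÙĬ", " ÙģÙī", " în", "į¼"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, the positive-logit entries decode to forms of the preposition "in" in several languages (Chinese 在, Thai ใน, Arabic في), alongside the already readable " in", " în", and " در". Entries such as į¼ do not form valid UTF-8 on their own and are left unchanged by the helper.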