    Explanations

    phrases related to political figures and establishments

    references to political affiliations and social structures within communities

    Negative Logits
    Token       Logit
    "acly"      -0.67
    "ãĥ¼ãĥĨ"     -0.66
    "ãĤ¶"       -0.62
    "Bul"       -0.62
    "Pg"        -0.60
    "ãĤ¯"       -0.59
    "ãĤ¦ãĤ¹"    -0.59
    " Incre"    -0.59
    "é¾įå"      -0.59
    "ãĤ¤"       -0.57
    Positive Logits
    Token          Logit
    " rejoice"     0.99
    " recognise"   0.88
    " unite"       0.88
    " agrees"      0.83
    " dared"       0.78
    " reacted"     0.77
    " accuse"      0.76
    " teamed"      0.76
    " raided"      0.76
    " allege"      0.75
    Activation Density: 0.467%

    No Known Activations