    Explanations

    references to political figures and their perceived characteristics or actions

    Negative Logits
    Token           Logit
    å¢ĥ             -0.16
    IFn             -0.15
    ');");↵         -0.15
    oleÄį           -0.15
    ventions        -0.15
    üle             -0.14
    ạng             -0.14
    stery           -0.14
    mploy           -0.14
    ertainment      -0.14

    Positive Logits
    Token           Logit
     figure         0.19
     moderate       0.18
     outsider       0.18
     abras          0.17
     politician     0.17
     polar          0.17
     prote          0.17
     Rhodes         0.17
     consum         0.16
     cage           0.16
    Activation Density: 0.174%

    No Known Activations