INDEX
    Explanations

    mentions of political figures and their actions related to social issues

    Negative Logits
     addCriterion       -0.16
    endale              -0.16
    æ³ī                 -0.15
    249                 -0.15
     \↵                 -0.14
    icontrol            -0.14
    ktor                -0.14
    \↵                  -0.14
    otor                -0.14
    stÃŃ                -0.14
    Positive Logits
    ouch                0.15
    aliz                0.14
    &o                  0.14
     bracket            0.14
     Enumerator         0.14
    oodoo               0.14
    vel                 0.13
    omin                0.13
     Duch               0.13
    rad                 0.13
    Act Density 0.491%

    No Known Activations