INDEX
    Explanations

    statements of opinion and agreement regarding social issues

    Negative Logits
    Token         Logit
    "elp"         -0.16
    " přitom"     -0.15
    "ippers"      -0.15
    "uez"         -0.15
    " citiz"      -0.14
    "aco"         -0.14
    "ills"        -0.14
    "-divider"    -0.13
    "ĵn"          -0.13
    "uits"        -0.13
    Positive Logits
    Token         Logit
    " there"      0.31
    " There"      0.23
    " THERE"      0.20
    "There"       0.20
    "there"       0.19
    " nobody"     0.18
    "avic"        0.16
    "anes"        0.15
    " no"         0.15
    "凡"          0.15
    Activation Density: 0.497%

    No Known Activations