    Explanations

    phrases associated with controversial or polarizing political debates and identities

    Head Attr Weights
    0:0.02
    1:0.01
    2:0.06
    3:0.33
    4:0.01
    5:0.02
    6:0.07
    7:0.11
    8:0.06
    9:0.11
    10:0.05
    11:0.09
    Negative Logits
    ngth          -1.45
    ITNESS        -1.43
    eele          -1.33
    fty           -1.30
    athered       -1.27
    eller         -1.26
    aternity      -1.23
    ruary         -1.21
     Creat        -1.19
    ellar         -1.17
    Positive Logits
     buffalo      1.18
     Bengal       1.17
     zombies      1.15
     psycho       1.15
     overdose     1.14
    adesh         1.11
     altogether   1.10
     offence      1.09
     Sharks       1.09
     heroin       1.09
    Act Density 0.034%

    No Known Activations