INDEX
    Explanations

    expressions of blame or criticism towards individuals and groups

    Head Attr Weights
    0:0.04
    1:0.03
    2:0.16
    3:0.05
    4:0.14
    5:0.06
    6:0.03
    7:0.02
    8:0.21
    9:0.12
    10:0.07
    11:0.03
    Negative Logits
    emouth      -1.43
     IPM        -1.40
    ��          -1.34
     tion       -1.28
     spanning   -1.28
    ispers      -1.27
    skilled     -1.26
    direction   -1.22
     endeavour  -1.21
     encount    -1.20
    Positive Logits
     Trayvon    1.61
    gio         1.41
     Guilty     1.41
    gins        1.38
    uer         1.29
     Madden     1.27
     Schwarz    1.26
     Shame      1.25
     Rah        1.24
     Moments    1.24
    Act Density 0.007%

    No Known Activations