INDEX
    Explanations

    phrases related to racial considerations in political contexts

    Head Attr Weights
    0:0.07
    1:0.03
    2:0.02
    3:0.09
    4:0.05
    5:0.07
    6:0.04
    7:0.03
    8:0.43
    9:0.06
    10:0.02
    11:0.03
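
    The head attribution weights above can be read programmatically. The following is a minimal sketch, assuming each entry has the form `head_index:weight` as listed; the helper names `parse_head_weights` and `top_head` are hypothetical and not part of any dashboard API.

```python
# Head attribution weights copied verbatim from the listing above.
# Assumption: each line is "<head index>:<weight>".
RAW_WEIGHTS = """\
0:0.07
1:0.03
2:0.02
3:0.09
4:0.05
5:0.07
6:0.04
7:0.03
8:0.43
9:0.06
10:0.02
11:0.03
"""


def parse_head_weights(text):
    """Parse 'index:weight' lines into a dict mapping head index to weight."""
    weights = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        head, value = line.split(":", 1)
        weights[int(head)] = float(value)
    return weights


def top_head(weights):
    """Return the (head index, weight) pair with the largest weight."""
    return max(weights.items(), key=lambda item: item[1])


weights = parse_head_weights(RAW_WEIGHTS)
head, weight = top_head(weights)
total = sum(weights.values())
print(f"top head: {head} (weight {weight:.2f}); listed weights sum to {total:.2f}")
```

    On the values listed, head 8 carries the largest weight (0.43), well ahead of the next largest (head 3 at 0.09).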
    Negative Logits
    ��
    -2.00
    lisher
    -1.87
    ��
    -1.84
    ��極
    -1.81
    cedented
    -1.67
     breakthrough
    -1.64
     Roose
    -1.58
    eks
    -1.58
    ende
    -1.57
     outputs
    -1.55
    Positive Logits
    lane
    1.85
    Against
    1.82
     ethnicity
    1.79
    wcsstore
    1.70
    against
    1.66
     mattered
    1.65
     vanquished
    1.63
    ="#
    1.61
     innocence
    1.60
     jurors
    1.60
    Act Density 0.003%

    No Known Activations