INDEX
    Explanations

    references to racial and ethnic identity

    Negative Logits
    Token            Logit
    ".cf"            -0.14
    "askan"          -0.14
    "ocale"          -0.14
    "ilon"           -0.14
    " Leer"          -0.14
    "иÑģÑĤÑĢа"       -0.13
    "è¹"             -0.13
    " thá»ķ"         -0.13
    "ickle"          -0.13
    "Ø®ÙĪ"           -0.12

    Positive Logits
    Token            Logit
    " look"          1.16
    " looks"         1.12
    "look"           1.00
    " Look"          0.98
    " looked"        0.97
    " LOOK"          0.97
    "looks"          0.95
    " Looks"         0.93
    "Look"           0.91
    "_look"          0.88

    Act Density 0.649%

    No Known Activations