INDEX
    Explanations

    references to cosmetic or physical attributes and their perceived societal impacts

    Negative Logits
    Token          Logit
    "ρύ"           -0.16
    "ensburg"      -0.15
    "дов"          -0.14
    "імі"          -0.14
    "ilton"        -0.14
    "digits"       -0.13
    "rdf"          -0.13
    " independ"    -0.13
    "ymoon"        -0.13
    "\CMS"         -0.13
    Positive Logits
    Token          Logit
    " problem"     0.24
    ".problem"     0.23
    "problem"      0.23
    " Wor"         0.22
    " Problem"     0.22
    " worry"       0.22
    "問題"         0.22
    "问题"         0.21
    " concern"     0.20
    " problema"    0.20
    Act Density 0.266%

    No Known Activations