    Explanations

    phrases that express defiance against social norms and challenges to stereotypes

    Negative Logits
    Token               Logit
    "ysl"               -0.16
    "638"               -0.15
    " Nobel"            -0.15
    "Backing"           -0.15
    "amerate"           -0.14
    "訴"                -0.13
    " à¤Ĩत"            -0.13
    "armac"             -0.13
    "ÃŃses"             -0.12
    "constitutional"    -0.12
    Positive Logits
    Token               Logit
    " convention"       0.38
    " conventions"      0.35
    " conventional"     0.33
    " established"      0.33
    " accepted"         0.31
    " norms"            0.30
    " expectations"     0.29
    " orth"             0.28
    " Convention"       0.28
    " establishment"    0.27
    Activation Density: 0.260%

    No Known Activations