INDEX
    Explanations

    expressions related to societal expectations and personal identity

    Negative Logits
    Token           Logit
    "ibs"           -0.18
    "mas"           -0.18
    "rl"            -0.16
    "maf"           -0.15
    "ainer"         -0.14
    "ible"          -0.14
    "uan"           -0.14
    "408"           -0.14
    "ronics"        -0.14
    "aporation"     -0.14
    Positive Logits
    Token           Logit
    " everybody"    0.21
    "everyone"      0.20
    " everyone"     0.20
    " Everyone"     0.18
    " specific"     0.17
    " individual"   0.17
    " Everybody"    0.17
    "æľĢæĸ°"        0.16
    "ione"          0.16
    " individ"      0.16
    Activation Density: 0.009%

    No Known Activations