INDEX
    Explanations

    themes related to identity and social belonging

    Negative Logits
    Token            Logit
    "FLAG"           -0.16
    " flagged"       -0.14
    "ih"             -0.14
    " discrim"       -0.14
    "mitt"           -0.14
    "hiba"           -0.14
    " mineral"       -0.14
    " rootReducer"   -0.13
    "omez"           -0.13
    " Flag"          -0.13
    Positive Logits
    Token            Logit
    " peer"          0.25
    " Peer"          0.22
    "peer"           0.22
    "Peer"           0.21
    "-peer"          0.21
    "Pressure"       0.19
    " Pressure"      0.18
    " herd"          0.18
    " conformity"    0.18
    " pressure"      0.18
    Activation Density: 0.123%

    No Known Activations