INDEX
    Explanations

    phrases emphasizing connections and interactions with others

    Negative Logits
    Token          Logit
    "etten"        -0.16
    "elah"         -0.16
    "rof"          -0.15
    "-pills"       -0.15
    "AZY"          -0.15
    "fan"          -0.14
    "ester"        -0.14
    "OPSIS"        -0.14
    "PROFILE"      -0.14
    "_observer"    -0.14
    Positive Logits
    Token          Logit
    " nhau"        0.28
    " other"       0.24
    " people"      0.20
    " different"   0.19
    " others"      0.18
    "/about"       0.18
    "other"        0.18
    " them"        0.18
    " him"         0.18
    " lik"         0.17
    Act Density 0.245%
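
The density figure above can be read as a simple ratio. The sketch below assumes, and this page does not confirm it, that "Act Density" means the fraction of token positions on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and only for illustration.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: density is counted per token position, and a position is
    "active" when its activation is strictly greater than the threshold.
    """
    if len(activations) == 0:
        # No positions observed, so report zero density rather than dividing by zero.
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


if __name__ == "__main__":
    # Hypothetical activations for four token positions; one is active.
    sample = [0.0, 0.0, 0.7, 0.0]
    density = activation_density(sample)
    # Format as a percentage, in the same style as the figure on this page.
    print(f"Act Density {density:.3%}")
```

Under this reading, a density of 0.245% would correspond to the feature firing on roughly 2 or 3 of every 1,000 token positions.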

    No Known Activations