INDEX
    Explanations

    instances of the word "we" in various contexts, indicating a focus on collective identity or shared experiences

    Negative Logits
    ctor           -0.18
    cf             -0.15
    cala           -0.15
     beforeSend    -0.15
    rog            -0.14
    ree            -0.14
    aye            -0.14
    sh             -0.14
    hawk           -0.14
    g              -0.13
    Positive Logits
    eping          0.21
    athers         0.18
    bservice       0.17
    ’re            0.17
    ’ll            0.16
    eding          0.16
    ’ve            0.16
    've            0.15
    eds            0.15
    ights          0.15
    Act Density 0.291%

    No Known Activations