    Explanations

    collective pronouns indicating inclusivity and shared experience

    Negative Logits
    Token            Logit
    " themselves"    -0.21
    "re"             -0.18
    "d"              -0.17
    "nya"            -0.17
    "ne"             -0.17
    "was"            -0.17
    "m"              -0.17
    "noon"           -0.16
    "w"              -0.15
    "ctor"           -0.15
    Positive Logits
    Token            Logit
    " ourselves"     0.45
    " all"           0.31
    "athers"         0.30
    "aves"           0.30
    "eping"          0.29
    "brtc"           0.28
    "blink"          0.27
    "eding"          0.27
    "asel"           0.26
    "aved"           0.26
    Activation Density: 0.458%

    No Known Activations