INDEX
    Explanations

    pronouns and phrases related to collaborative actions

    the phrase "We" indicating collective intentions or actions

    Negative Logits
    Token           Logit
    " Unicorn"      -0.64
    "cum"           -0.63
    "personal"      -0.60
    "imum"          -0.60
    "Ore"           -0.60
    "REDACTED"      -0.59
    "aunt"          -0.58
    "forms"         -0.57
    " Rarity"       -0.56
    "mund"          -0.56
    Positive Logits
    Token           Logit
    "'re"            1.41
    "'ve"            1.35
    "'ll"            1.16
    "ird"            1.09
    "bsite"          1.08
    " gotta"         1.03
    "athered"        1.02
    " ourselves"     1.00
    "ldon"           0.99
    "akening"        0.99
    Activation Density: 0.205%
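The density figure above is presumably the fraction of token positions on which the feature fires. The source does not define it, so the sketch below rests on that assumption. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical; the sample is made up so that the arithmetic reproduces a value of the same size and is not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature is active; the source does not state the exact definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 20,000 token positions, 41 of them with a
# nonzero activation (illustrative numbers only).
sample = [0.0] * 19_959 + [0.5] * 41
density = activation_density(sample)
print(f"Activation density: {density:.3%}")
```

A density this low means the feature is sparse: on this reading it would fire on roughly one token in five hundred.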

    No Known Activations