INDEX
    Explanations

    instances of the pronoun "we."

    instances of the word "we."

    Negative Logits
    Token         Logit
    "him"         -0.76
    "ragon"       -0.62
    "yna"         -0.58
    "emon"        -0.55
    "incial"      -0.54
    "ensed"       -0.53
    "igun"        -0.52
    "Override"    -0.52
    "ounding"     -0.51
    " him"        -0.50
    Positive Logits
    Token           Logit
    " we"           2.67
    "we"            1.75
    "We"            1.71
    " We"           1.67
    " our"          1.52
    " ourselves"    1.51
    " WE"           1.40
    " ours"         1.39
    "Our"           1.24
    " us"           1.18
    Activation Density: 0.170%

    No Known Activations