INDEX
    Explanations

    pronouns referring to oneself

    Negative Logits
    Token            Logit
    " flags"         -0.63
    " recycled"      -0.62
    " reused"        -0.61
    " stripes"       -0.60
    " Scarlet"       -0.60
    " Flags"         -0.59
    " brackets"      -0.59
    " instances"     -0.59
    " primitive"     -0.58
    " franchises"    -0.57
    Positive Logits
    Token     Logit
    "me"      4.23
    "mes"     2.17
    "ME"      2.05
    "Me"      1.92
    " ME"     1.49
    "med"     1.48
    " Me"     1.45
    "mine"    1.45
    "my"      1.40
    "ming"    1.37
    Act Density 0.012%

    No Known Activations