INDEX
    Explanations

    repetitive usage of the word "I."

    Negative Logits
    Token          Logit
    " Efq"         -1.29
    " ―――――"       -1.26
    " itſelf"      -1.25
    " faſt"        -1.14
    "ſelves"       -1.14
    " ་་"          -1.13
    " iſt"         -1.12
    " Monfieur"    -1.10
    " ſeveral"     -1.09
    " myſelf"      -1.08
    Positive Logits
    Token          Logit
    " it"          1.13
    " I"           1.09
    " he"          1.09
    " we"          1.08
    (whitespace)   0.84
    " she"         0.84
    " you"         0.83
    "↵↵"           0.79
    "we"           0.78
    "it"           0.77
    Activation Density: 0.173%

    No Known Activations