INDEX
    Explanations
    Negative Logits
    Token           Logit
    " became"       -1.05
    "became"        -0.98
    " Became"       -0.94
    " maintains"    -0.68
    " gave"         -0.64
    " was"          -0.60
    "成为了"         -0.60
    " came"         -0.59
    " went"         -0.59
    " wurde"        -0.57

    Positive Logits
    Token           Logit
    " myſelf"       1.09
    " pleaſure"     1.07
    " Majefty"      1.06
    " purpoſe"      1.02
    " ſtate"        0.93
    " cauſe"        0.93
    " Efq"          0.90
    " Monfieur"     0.88
    " caufe"        0.88
    " itſelf"       0.87

    Activation Density: 1.105%

    No Known Activations