    Negative Logits
    Token          Logit
    " myſelf"      -2.08
    " itſelf"      -2.03
    " Shakspeare"  -1.94
    " Monfieur"    -1.94
    " Efq"         -1.88
    " pleaſure"    -1.88
    " Jefus"       -1.84
    " Houſe"       -1.77
    " Majefty"     -1.73
    " Cæsar"       -1.72
    Positive Logits
    Token          Logit
    "au"           0.60
    " La"          0.60
    " G"           0.58
    "os"           0.57
    " Le"          0.57
    " I"           0.57
    " A"           0.56
    " H"           0.56
    "an"           0.55
    " M"           0.55
    Activation Density: 0.223%

    No Known Activations