    Negative Logits
    Token           Logit
    " Efq"          -1.33
    " Jefus"        -1.22
    " ſtate"        -1.20
    " itſelf"       -1.20
    " Monfieur"     -1.20
    " pleaſure"     -1.14
    " faſt"         -1.12
    " himſelf"      -1.11
    " Majefty"      -1.11
    " myſelf"       -1.10
    Positive Logits
    Token           Logit
    (not shown)      0.73
    ","              0.71
    "."              0.68
    " “"             0.65
    " ("             0.62
    (not shown)      0.62
    " for"           0.61
    " I"             0.58
    " at"            0.57
    " by"            0.57
    Activation Density: 0.058%

    No Known Activations