INDEX
    Negative Logits
     Jefus       -1.47
     myſelf      -1.33
     ſche        -1.30
     purpoſe     -1.30
     pleaſure    -1.29
     Majefty     -1.27
     Eſ          -1.27
     itſelf      -1.26
     Theſe       -1.25
     Monfieur    -1.24
    Positive Logits
     in           0.62
     i            0.60
    ,             0.59
    <eos>         0.59
     of           0.56
     T            0.56
     and          0.56
    .             0.55
     t            0.54
     to           0.54
    Activation Density: 0.078%

    No Known Activations