INDEX
    NEGATIVE LOGITS
    Token        Logit
    "Hello"      -0.60
    " in"        -0.59
    " on"        -0.59
    "Hey"        -0.55
    " of"        -0.53
    "Dear"       -0.51
    " at"        -0.50
    " or"        -0.48
    " del"       -0.48
    "Alright"    -0.47
    POSITIVE LOGITS
    Token            Logit
    " Efq"           0.89
    " houſe"         0.88
    " myſelf"        0.88
    " Majefty"       0.88
    " Houſe"         0.87
    " Monfieur"      0.86
    " Mahomet"       0.85
    " Shakspeare"    0.84
    " itſelf"        0.84
    " pleaſure"      0.83
    Activation Density: 0.123%

    No Known Activations