    Negative Logits
    Token          Logit
    " myſelf"      -1.28
    " itſelf"      -1.22
    " raiſ"        -1.18
    " Monfieur"    -1.14
    " uſed"        -1.12
    " purpoſe"     -1.11
    " ſtate"       -1.05
    " houſe"       -1.04
    " himſelf"     -1.02
    " poffible"    -1.02
    Positive Logits
    Token          Logit
    " ("           0.63
    " L"           0.60
    " H"           0.58
    ","            0.57
    " and"         0.57
                   0.57
    " B"           0.56
    " the"         0.55
    " an"          0.55
    " in"          0.54
    Activation Density: 0.570%

    No Known Activations