INDEX
    Explanations
    Negative Logits
     purpoſe    -1.16
     myſelf     -1.09
     Efq        -1.09
     greateſt   -1.09
     houſe      -1.06
     itſelf     -1.06
     Jefus      -1.03
     ſeveral    -1.02
    expandindo  -1.02
     Reſ        -0.99
    Positive Logits
     to         1.19
     To         0.63
     in         0.61
     on         0.57
     La         0.57
    to          0.55
     de         0.55
     la         0.55
     De         0.55
     into       0.55
    Act Density 0.129%

    No Known Activations