INDEX
    Explanations
    NEGATIVE LOGITS
    " No"     -1.45
    "No"      -1.21
    "NO"      -0.62
    " tap"    -0.60
    " NO"     -0.55
    " Nos"    -0.55
    " №"      -0.52
    " the"    -0.50
    "Nos"     -0.46
    "s"       -0.46
    POSITIVE LOGITS
    " purpoſe"     0.96
    " ſmall"       0.90
    " Houſe"       0.88
    " Diſ"         0.87
    " perſon"      0.87
    " reaſon"      0.85
    " ſtate"       0.85
    " myſelf"      0.85
    " Anſ"         0.84
    " greateſt"    0.84
    Activation Density: 0.093%

    No Known Activations