INDEX
    Explanations

    beginning markers in written content

    Negative Logits
     de           -0.98
     del          -0.95
     sa           -0.92
     A            -0.91
     int          -0.90
     is           -0.90
     in           -0.89
     et           -0.88
     I            -0.87
     y            -0.86
    Positive Logits
     itſelf       1.63
     doubtnut     1.51
     myſelf       1.51
     pleaſure     1.45
     Anſ          1.40
     uſed         1.38
     ་་           1.35
     unſ          1.34
    ſelf          1.33
     poffible     1.33
    Act Density 0.151%

    No Known Activations