INDEX
    Explanations

    references to specific codes or categories

    Negative Logits
    " Theſe"     -1.01
    " ་་"        -0.99
    " Beſ"       -0.96
    " ſeveral"   -0.91
    " Diſ"       -0.90
    " myſelf"    -0.89
    " raiſ"      -0.86
    " Anſ"       -0.85
    " ―――――"     -0.85
    " faſt"      -0.84
    Positive Logits
    " R"      1.84
    "R"       1.72
    " r"      1.53
    "getR"    1.41
    "r"       1.22
    " M"      1.09
    " L"      1.09
    "आर"      1.07
    " P"      1.04
    " S"      1.02
    Activation Density: 0.176%

    No Known Activations