INDEX
    Explanations

    numeric values in a specific format or pattern

    NEGATIVE LOGITS
     Theſe
    -1.04
     myſelf
    -0.97
     themſelves
    -0.94
     doubtnut
    -0.92
     ་་
    -0.91
     Anſ
    -0.91
    wiſe
    -0.91
     ſeveral
    -0.90
     raiſ
    -0.90
     Diſ
    -0.88
    POSITIVE LOGITS
     l
    1.73
     L
    1.68
    L
    1.54
    getL
    1.39
     r
    1.10
    l
    1.08
     s
    1.05
    1.03
     t
    1.01
     d
    0.99
    Act Density 0.129%

    No Known Activations