    Explanations

    numbers followed by zero

    Negative Logits
    Token               Logit
    (not rendered)      0.84
    " sixteen"          0.84
    " seventeen"        0.81
    " seven"            0.81
    " nineteen"         0.80
    " Eight"            0.80
    " thirteen"         0.79
    " eleven"           0.78
    " Fourteen"         0.78
    " Sixteen"          0.77
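    Positive Logits
    Token               Logit
    "0"                 1.53
    "۰"                 1.35
    (not rendered)      1.32
    (not rendered)      1.28
    (not rendered)      1.27
    (not rendered)      1.22
    "٠"                 1.19
    "𝟬"                 1.12
    (not rendered)      1.09
    (not rendered)      1.09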
    Act Density 0.657%

    No Known Activations