INDEX
    Explanations

    numbers with non-standard zeros
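As a rough illustration of what "non-standard zeros" could mean here, the sketch below uses Python's standard `unicodedata` module to find decimal digits whose value is 0 but which are not the ASCII character `0`. This reading of the explanation is an assumption, prompted by the non-ASCII zero glyph (۰) in the positive-logit list. The `CANDIDATES` sample and both helper names are hypothetical and are not taken from the dashboard.

```python
import unicodedata

# Hypothetical sample: ASCII zero plus three non-ASCII zeros, written as
# escapes so the code points are unambiguous.
CANDIDATES = ["0", "\u0660", "\u06f0", "\uff10"]


def describe_zero(ch):
    """Return (code point, Unicode name, decimal value) for one digit character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.decimal(ch))


def is_nonstandard_zero(ch):
    """True if ch is a decimal digit with value 0 other than ASCII '0'."""
    # decimal() returns the supplied default (None) for non-digit characters.
    return ch != "0" and unicodedata.decimal(ch, None) == 0


for ch in CANDIDATES:
    print(describe_zero(ch), is_nonstandard_zero(ch))
```

A check like this only covers characters that Unicode classifies as decimal digits; whether the feature also responds to other zero-like forms is not something the listing shows.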

    Negative Logits
    Token              Logit
    (blank)            0.52
    " mes"             0.51
    " dumb"            0.49
    "ază"              0.49
    " teal"            0.49
    " inescap"         0.47
    " wavefunction"    0.47
    "sembling"         0.46
    " lenient"         0.46
    "mäss"             0.46
    Positive Logits
    Token              Logit
    "0"                0.97
    "9"                0.73
    "۰"                0.70
    "8"                0.70
    (blank)            0.68
    "7"                0.67
    (blank)            0.67
    "4"                0.65
    (blank)            0.65
    "6"                0.64
    Activation Density: 0.412%

    No Known Activations