INDEX
    Explanations

    mathematical expressions involving derivatives or other calculus notation

    Negative Logits
    ^(@)          -2.72
     itſelf       -2.58
     Efq          -2.45
     myſelf       -2.42
     Мексичка     -2.41
    NUMX          -2.41
     $_"          -2.34
     Forumite     -2.27
    ſelves        -2.23
     ―――――        -2.20

    Positive Logits
    .             1.90
    ↵↵            1.73
    (blank)       1.70
    -             1.69
    <eos>         1.63
    (             1.63
    /             1.55
    ,             1.54
     (            1.45
    :             1.44
    Activation Density: 0.047%

    No Known Activations