INDEX
    Explanations

    capital letters followed by a single digit

    occurrences of the placeholder token "

    Negative Logits
    Token                 Logit
    " wip"                -0.80
    "ãĥ¼ãĥĨãĤ£"           -0.79
    "ãĤ¼ãĤ¦ãĤ¹"           -0.73
    "yards"               -0.71
    " substitutes"        -0.70
    " CoC"                -0.69
    " à¨"                 -0.69
    " sinks"              -0.69
    " unfocusedRange"     -0.67
    " Dickinson"          -0.67

    Positive Logits
    Token                 Logit
    "ceans"               1.24
    "lymp"                1.23
    "culus"               1.19
    "vernight"            1.17
    "tto"                 1.17
    "BS"                  1.08
    "rient"               1.07
    "scill"               1.06
    "zzy"                 1.05
    "oops"                1.05

    Act Density 0.023%

    No Known Activations