INDEX
    Negative Logits
    )-\             0.46
     interruptions  0.43
    )}{\            0.43
    )+\             0.42
    )^{\            0.40
    )=-\            0.39
     sprinkles      0.39
     scrubs         0.39
    َال              0.39
    :=\             0.38
    Positive Logits
    *               0.93
     *              0.67
    *>(             0.64
    *,              0.60
    *;              0.54
    *'              0.52
    *\*             0.52
    *((*            0.52
    *',             0.51
    \*              0.50
    Activation Density: 0.003%

    No Known Activations