INDEX
    Explanations

    mathematical expressions and symbols

    mathematical expressions or symbols indicating addition

    Negative Logits
     Nare     -0.69
     Sparkle  -0.68
    ruary     -0.64
     Dise     -0.64
    opathic   -0.64
    pace      -0.63
    Runner    -0.63
     Grimes   -0.63
     Turing   -0.62
     Petty    -0.61
    Positive Logits
    ++++++++++++++++                  1.36
    -+-+-+-+                          1.30
    ++++                              1.11
    /+                                1.06
    ++++++++                          0.99
    --+                               0.95
    -+                                0.95
    /-                                0.90
    --------------------------------  0.78
    union                             0.77
    Act Density 0.021%

    No Known Activations