INDEX
    Explanations

    mathematical symbols and code placeholders

    Negative Logits
    " B"    0.98
    " A"    0.97
    " C"    0.97
    "A"     0.89
    " F"    0.87
    "B"     0.86
    " Y"    0.81
    "X"     0.81
    " X"    0.81
    " P"    0.76
    Positive Logits
    " m"    1.84
    " d"    1.55
    " r"    1.52
    " v"    1.52
    " q"    1.51
    " n"    1.51
    " g"    1.48
    " s"    1.47
    " h"    1.47
    " p"    1.44
    Act Density 0.685%

    No Known Activations