INDEX
    Explanations

    math equations

    The neuron selectively activates on numeric literal tokens (especially floating‐point numbers).

    Negative Logits
    Token      Logit
    "+a"       -0.06
    "Grace"    -0.06
    " Kane"    -0.06
    "ctrl"     -0.06
    " admire"  -0.06
    "ela"      -0.06
    " fila"    -0.06
    "Social"   -0.06
    "Pattern"  -0.06
    "\ttask"   -0.06
    Positive Logits
    Token         Logit
                  0.06
    " rover"      0.06
    "urr"         0.06
    ".Free"       0.06
    "-stack"      0.06
    " Hamas"      0.06
    "grams"       0.06
    "ONS"         0.06
    " explained"  0.06
    " seizures"   0.06
    Act Density 0.004%

    No Known Activations