    Explanations

    The neuron activates on numeric tokens, especially floating-point numbers.

    Negative Logits
    " righteousness"                       -0.06
    "~~~~~~~~~~~~~~~~"                     -0.06
    " SIDE"                                -0.06
    ".SQL"                                 -0.06
    "conference"                           -0.06
    " Bers"                                -0.06
    "(By"                                  -0.06
    " --------------------------------"    -0.06
    "Warn"                                 -0.06
    " Virgin"                              -0.06
    Positive Logits
                  0.08
    " escort"     0.07
    " diffic"     0.07
    " Jungle"     0.06
    " kappa"      0.06
    "-value"      0.06
    "secret"      0.06
    " MISSING"    0.06
    "-launch"     0.06
    "osh"         0.06
    Act Density 0.008%

    No Known Activations