INDEX
    Explanations

    This neuron detects the standalone word “up,” as in casual greetings like “what’s up.”

    Negative Logits
    Token           Logit
     Numero         -0.07
    شناس            -0.06
     μό             -0.06
     trous          -0.06
    @Table          -0.06
    님이            -0.06
     YES            -0.06
    (Cs             -0.06
    .US             -0.06
     ubyt           -0.06

    Positive Logits
    Token           Logit
    LANGADM         0.07
     toán           0.07
     constituents   0.07
     Dual           0.07
    _cell           0.06
    getWidth        0.06
    -nine           0.06
    -separated      0.06
    553             0.06
     udp            0.06
    Activation Density: 0.011%

    No Known Activations