INDEX
    Explanations

    This neuron specifically fires on occurrences of the standalone word “Spell.”

    Negative Logits
    Token        Logit
    "ac"         -0.08
    " Jacobs"    -0.07
    " bicy"      -0.07
    "_nc"        -0.07
    " torso"     -0.07
    "cou"        -0.07
    " 국내"      -0.07
    "Kim"        -0.07
    " MAK"       -0.07
    " Davis"     -0.07

    Positive Logits
    Token            Logit
    " Spell"         0.13
    " spell"         0.12
    "Spell"          0.10
    "pell"           0.09
    "spell"          0.08
    " spells"        0.08
    " spelling"      0.08
    "SPELL"          0.08
    "SSL"            0.08
    (token missing)  0.07

    Activation Density: 0.005%

    No Known Activations
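
The density figure of 0.005% reported above can be illustrated with a small sketch. It assumes the figure means the fraction of sampled tokens on which the neuron's activation is above zero; the page itself does not define it. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and exist only for illustration.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    neuron fires at all. The source page does not state its definition.
    """
    if not activations:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up sample: the neuron fires on 1 token out of 20,000.
sample = [0.0] * 19_999 + [1.7]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.005%
```

Under this reading, a density of 0.005% corresponds to roughly one active token per 20,000 sampled, which fits a neuron tied to a single word family.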