INDEX
    Explanations

    This neuron fires on longer tokens (especially multi-syllable words), essentially acting as a “long-word” detector.

    Negative Logits
    Token             Logit
    " lots"           -0.08
    ".Line"           -0.06
    " Xt"             -0.06
    "Bed"             -0.06
    " اینترنتی"       -0.06
    " Coc"            -0.06
    "Hor"             -0.06
    " nhỏ"            -0.06
    " Svg"            -0.06
    "-helper"         -0.06
    Positive Logits
    Token             Logit
    (not shown)       0.06
    "monds"           0.06
    " residues"       0.06
    (not shown)       0.06
    "_placeholder"    0.06
    "indices"         0.06
    " quelle"         0.05
    "кам"             0.05
    (not shown)       0.05
    " vše"            0.05
    Act Density 0.214%

    No Known Activations