    Explanations

    The neuron fires on wordpiece tokens that begin multi-syllable or uncommon words (often proper nouns or technical terms).

    Negative Logits
     komple
    -0.06
     verified
    -0.06
     CAN
    -0.06
    	array
    -0.06
     signing
    -0.06
    incare
    -0.06
     đảng
    -0.06
    -0.05
     grades
    -0.05
     گ
    -0.05
    Positive Logits
    :Event
    0.07
     responders
    0.07
    _photos
    0.07
    \FrameworkBundle
    0.07
    _lex
    0.07
    Road
    0.06
    -capital
    0.06
    recio
    0.06
     iq
    0.06
    acs
    0.06
    Activation Density: 0.070%

    No Known Activations