INDEX
    Explanations

    This neuron detects occurrences of the word “sign” (and its plural “signs”) used in the sense of an indicator or cue.

    Negative Logits
    _sold       -0.08
    imd         -0.07
    rights      -0.07
     sortable   -0.07
    Tên         -0.06
     acc        -0.06
    ستم         -0.06
     Eduardo    -0.06
    -aged       -0.06
    wed         -0.06
    Positive Logits
    ostream     0.07
    	db         0.07
    (distance   0.06
    _CUR        0.06
     zurück     0.06
    469         0.06
    保证        0.06
     vezes      0.06
     ode        0.06
    0.06
    Act Density 0.004%

    No Known Activations