    Explanations

    This neuron detects language about incentives and motivation: terms describing performance rewards, cooperation versus competition, and actions meant to motivate.

    Negative Logits
    Token             Logit
    " strokeWidth"    -0.07
    " hintText"       -0.07
    " Churches"       -0.06
    "(Window"         -0.06
    "TEXT"            -0.06
    " trở"            -0.06
    " سو"             -0.06
    " شدن"            -0.06
    ".keyboard"       -0.06
    "728"             -0.06
    Positive Logits
    Token             Logit
    "jpeg"            0.07
    "learning"        0.06
    "reau"            0.06
    " Http"           0.06
    "rgb"             0.06
    " trách"          0.06
    "\tauth"          0.06
    " Answer"         0.06
    "rade"            0.06
    "Extra"           0.06
    Activation Density: 0.078%

    No Known Activations