INDEX
    Explanations

    Speaking out/expressing opinions

    The neuron activates on tokens in phrases like “spoke out” or “spoken out,” i.e., when the text describes someone publicly speaking up or voicing criticism.

    Negative Logits
    Token          Logit
    " εφαρ"        -0.06
    "	Task"        -0.06
    "_ASS"         -0.06
    "费用"         -0.06
    "_gshared"     -0.06
    "]['"          -0.06
    " snprintf"    -0.06
    "Gro"          -0.06
    "_META"        -0.06
    "_Generic"     -0.06
    Positive Logits
    Token          Logit
    " outspoken"   0.11
    ".twimg"       0.07
    " λίγ"         0.07
    ".xrLabel"     0.07
                   0.07
    " responsive"  0.06
    " Rotterdam"   0.06
    " u"           0.06
                   0.06
    " Fisher"      0.06
    Act Density 0.006%

    No Known Activations