INDEX
    Explanations

    Analysis/Observation

    This neuron fires on analytical framing words in hypothetical or conditional expressions (e.g. “we look,” “one takes,” “we consider”) that introduce examples or evidence.

    Negative Logits
     твор  -0.08
     yaşında  -0.07
    866  -0.06
    ским  -0.06
     Dick  -0.06
    מ  -0.06
    าคา  -0.06
    kova  -0.06
    rette  -0.06
     smart  -0.06
    Positive Logits
     regarded  0.07
    (se  0.06
     doğrult  0.06
    /connect  0.06
    	tr  0.06
    ="-  0.06
    	image  0.06
    ­i  0.06
    illustr  0.06
    >v  0.06
    Act Density 0.032%

    No Known Activations