INDEX
    Explanations

    The neuron selectively fires on occurrences of the word “motivation” (and its subword variants).

    Negative Logits
    Token              Logit
    " creek"           -0.07
    " black"           -0.07
    " Ernst"           -0.07
    " blocked"         -0.06
    " zoo"             -0.06
    (no token text)    -0.06
    " lạnh"            -0.06
    " scraping"        -0.06
    (no token text)    -0.06
    " crisp"           -0.06
    Positive Logits
    Token              Logit
    " motivated"       0.11
    " motivation"      0.11
    " motivate"        0.10
    " Mot"             0.10
    "mot"              0.10
    " motivational"    0.10
    " motiv"           0.10
    "Mot"              0.10
    " motivating"      0.09
    " motivations"     0.08
    Act Density 0.012%
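
    As a rough illustration of the density figure above, the sketch below assumes that "Act Density" means the percentage of sampled tokens on which the neuron's activation is above zero. That reading is an assumption, not something this page states, and the function name `activation_density` and its `threshold` parameter are hypothetical.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the percentage of activations strictly above `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    neuron is active. The dashboard's exact definition is not given.
    """
    if len(activations) == 0:
        raise ValueError("activations must not be empty")
    # Count tokens where the neuron fires above the threshold.
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Toy data: one active token out of four.
toy_density = activation_density([0.0, 0.0, 2.0, 0.0])
```

    Under this reading, a density of 0.012% corresponds to roughly 12 active tokens per 100,000 sampled.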

    No Known Activations