INDEX
    Explanations

    Expressions or phrases related to language comprehension and communication.

    The neuron strongly activates on occurrences of the modal verb “can” (and variants expressing ability), i.e. statements of what the assistant is able to do.

    Negative Logits
     advantages       -0.07
     accommodations   -0.07
     analyze          -0.07
    observer          -0.07
     vyu              -0.07
     belongings       -0.06
    Nach              -0.06
    Bo                -0.06
    perator           -0.06
    izace             -0.06
    Positive Logits
    *****↵↵           0.06
    ][-               0.06
                      0.06
     Colum            0.06
     могу             0.06
    _MAXIMUM          0.06
     Güney            0.06
     gồm              0.06
    으면                0.06
     yoktu            0.06
    Act Density 0.039%

    No Known Activations