INDEX
    Explanations

    The neuron strongly activates on verbs that offer assistance or support (e.g. “help,” “assist”).

    Negative Logits
    " instit"     -0.06
    " zákaz"      -0.06
    " ric"        -0.06
    "quee"        -0.06
    "_records"    -0.06
    "ignon"       -0.06
    "üne"         -0.06
                  -0.06
    " ora"        -0.06
    " nim"        -0.06
    Positive Logits
    " help"       0.15
    " helps"      0.13
    " helped"     0.13
    " helping"    0.13
    " Help"       0.12
    "help"        0.12
    " Helping"    0.11
    " helpful"    0.11
    " Helps"      0.11
    "Help"        0.10
    Act Density 0.133%
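
    The page does not define activation density. A minimal sketch, assuming it is the fraction of sampled tokens on which the neuron's activation exceeds zero; the function name, the threshold parameter, and the sample data below are hypothetical and chosen only for illustration:

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation is above `threshold`.

    Assumption: "density" means the share of tokens on which the neuron fires,
    not the mean activation value.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the neuron fires on 1 of 750 tokens.
sample = [0.0] * 749 + [2.5]
print(f"{activation_density(sample):.3%}")
```

    Under this reading, a density of 0.133% corresponds to the neuron firing on roughly one token in 750, which fits a neuron tied to a single word family.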

    No Known Activations