INDEX
    Explanations

    The neuron activates on occurrences of the word “sentence” (and its inflected forms) in the text.

    Negative Logits
    Token           Logit
    "_get"          -0.07
    "Mag"           -0.07
    " stud"         -0.06
    " Coalition"    -0.06
    "14"            -0.06
    " air"          -0.06
    " coalition"    -0.06
    " anarchist"    -0.06
    ".Get"          -0.06
    " expo"         -0.06
    Positive Logits
    Token           Logit
    " sentencing"   0.08
                    0.08
    " sentence"     0.08
    " Sentence"     0.07
    " sentenced"    0.07
    "_serializer"   0.07
    ")set"          0.07
    " tint"         0.07
    "ених"          0.07
    "etur"          0.07
    Activation Density: 0.005%

    No Known Activations