INDEX
    Explanations

    This neuron activates on the literal word “Passage.”

    Negative Logits
    Token                 Logit
    ".len"                -0.07
    "/index"              -0.07
    " Min"                -0.07
    "345"                 -0.07
    ";\"                  -0.06
    " mph"                -0.06
    " Client"             -0.06
    (token not shown)     -0.06
    "1"                   -0.06
    "%"                   -0.06
    Positive Logits
    Token                 Logit
    " passage"            0.15
    " passages"           0.13
    " Passage"            0.12
    " doprov"             0.10
    "uge"                 0.08
    "age"                 0.08
    "anse"                0.08
    "аче"                 0.08
    "ще"                  0.08
    "cce"                 0.07
    Act Density 0.006%

    No Known Activations