    Explanations

    This neuron activates on the phrase “presence of.”

    Negative Logits
    "excel"           -0.08
    " unlucky"        -0.07
    " stol"           -0.07
    "ominated"        -0.07
    "_label"          -0.07
    "Exercise"        -0.07
    "Experimental"    -0.07
    " accelerated"    -0.07
    "Split"           -0.07
    "_EXIT"           -0.07
    Positive Logits
    " presence"       0.14
    "presence"        0.11
    " Presence"       0.10
    " друж"           0.07
    " absence"        0.07
    " Pen"            0.07
    "ence"            0.07
    " نگهد"           0.07
    " moisture"       0.07
    "entar"           0.07
    Activation Density: 0.015%

    No Known Activations