    Explanations

    This neuron doesn’t respond to any tokens; it never activates for any input.

    Negative Logits
     coffin        -0.06
    _models        -0.06
    ="\            -0.06
     modifying     -0.06
     сфері         -0.06
     Terminator    -0.06
     Merlin        -0.06
    yum            -0.06
     jerk          -0.06
     розк          -0.06
    Positive Logits
    ingt           0.07
     mentally      0.07
     Appeal        0.07
    _CITY          0.07
     зобов         0.07
     prior         0.06
     свидетель     0.06
    _was           0.06
    onga           0.06
     temp          0.06
    Act Density 0.004%

    No Known Activations