    Explanations

    computer layers

    This neuron never activates—it does not respond to any token.

    Negative Logits
    "urse"      -0.07
    "ylko"      -0.07
    "uação"     -0.07
    "urses"     -0.06
    "afi"       -0.06
    " degrade"  -0.06
    "уст"       -0.06
    "τά"        -0.06
    " ruins"    -0.06
    "曜日"      -0.06
    Positive Logits
    "Sections"        0.06
    " ση"             0.06
    " Όμιλος"         0.06
    " speeding"       0.06
    (token missing)   0.06
    "_US"             0.06
    "PositiveButton"  0.06
    " 참여"           0.06
    (token missing)   0.06
    " ceasefire"      0.05
    Activation Density: 0.021%

    No Known Activations