INDEX
    Explanations

    The neuron detects the subword fragments of “zombie” (and closely related terms like “apocalypse”).

    Negative Logits
    Feed            -0.07
     Unblock        -0.07
    ADM             -0.06
    PLAN            -0.06
    Showing         -0.06
    <Application    -0.06
    งหมด            -0.06
    658             -0.06
    üst             -0.06
    在线视频        -0.06
    Positive Logits
                    0.08
     prev           0.07
    hoot            0.06
    anitize         0.06
    \"",↵           0.06
     parsley        0.06
    тон             0.06
                    0.06
    _finalize       0.06
     кишеч          0.06
    Act Density 0.013%

    No Known Activations