    Explanations

    The neuron responds to modal qualifiers, firing most strongly on words like “possible” or “necessary.”

    Negative Logits
    uters                   -0.07
    HTTPRequestOperation    -0.06
     tragedies              -0.06
    urpose                  -0.06
    fried                   -0.06
    .orange                 -0.06
    brief                   -0.06
    -graph                  -0.06
    rieben                  -0.06
     случай                 -0.06
    Positive Logits
     Roma                   0.07
    (blank)                 0.07
     unb                    0.06
    еров                    0.06
     Lond                   0.06
     hava                   0.06
    ****↵                   0.06
     Rarity                 0.06
     syncing                0.06
    exter                   0.06
    Activation Density: 0.022%

    No Known Activations