    Explanations

    This neuron fires on named entities, especially film titles, actor/character names, and other proper nouns.

    Negative Logits
    Token          Logit
    "_pw"          -0.06
    (not shown)    -0.06
    (not shown)    -0.06
    "_VISIBLE"     -0.06
    "Store"        -0.06
    " damages"     -0.06
    ".Priority"    -0.06
    " nuclear"     -0.06
    " 등록"         -0.06
    " medio"       -0.06
    Positive Logits
    Token          Logit
    " situation"   0.06
    " legally"     0.06
    "books"        0.06
    " parad"       0.06
    (not shown)    0.06
    "ινή"          0.06
    "ským"         0.06
    ">alert"       0.06
    " BEEN"        0.06
    "ανδ"          0.06
    Act Density 0.015%

    No Known Activations