    Explanations

    Common English words

    This neuron lights up on tokens that belong to titles or section headings (e.g., names of films, books, articles).

    Negative Logits
    csv
    -0.07
    twenty
    -0.07
    dued
    -0.07
    -0.07
     talent
    -0.07
     hypertension
    -0.06
    те
    -0.06
     viewers
    -0.06
    $/)
    -0.06
    -0.06
    Positive Logits
     قاب
    0.07
    riteln
    0.06
    ürlich
    0.06
     LOD
    0.06
     companyId
    0.06
     Autos
    0.06
     redefine
    0.06
    izontally
    0.06
     mír
    0.06
     Experience
    0.06
    Act Density 0.065%

    No Known Activations