    Explanations

    references to individuals and their affiliations

    This neuron activates strongly on proper nouns and place names, particularly those that appear in citations, acknowledgments, and author attributions.

    Negative Logits
    Token (quoted)       Logit
    " שוליים"            -0.67
    " snippetHide"       -0.67
    " GenerationType"    -0.63
    "MessageOf"          -0.63
    "enderror"           -0.60
    " EconPapers"        -0.60
    "exitRule"           -0.58
    "addCriterion"       -0.57
    " ब्रेकडाउन"          -0.57
    "verwijspagina"      -0.55
    Positive Logits
    Token (quoted)       Logit
    "opinion"            0.37
    " Strö"              0.35
    " opin"              0.35
    "Mind"               0.35
    " belo"              0.34
    "%^"                 0.34
    "ele"                0.33
    " Mind"              0.33
    " Todes"             0.33
    " gros"              0.32
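
The page does not say how the logit lists are produced. A minimal sketch, assuming they come from projecting the neuron's output direction through the model's unembedding matrix and ranking tokens by the resulting value. The function name `top_logit_effects`, its parameters, and the toy numbers in the usage note are all hypothetical and are not taken from this page.

```python
def top_logit_effects(direction, unembed, vocab, k=3):
    """Rank tokens by the logit change a neuron's output direction would cause.

    direction: list of d floats (assumed: the neuron's output weights).
    unembed:   d rows, each with len(vocab) floats (assumed unembedding matrix).
    vocab:     list of token strings, one per unembedding column.
    Returns (most_negative, most_positive), each a list of (token, value) pairs.
    """
    effects = []
    for j in range(len(vocab)):
        # Dot product of the direction with column j of the unembedding matrix.
        value = sum(direction[i] * unembed[i][j] for i in range(len(direction)))
        effects.append((vocab[j], value))

    ranked = sorted(effects, key=lambda pair: pair[1])  # ascending by value
    most_negative = ranked[:k]
    most_positive = ranked[::-1][:k]
    return most_negative, most_positive
```

With a toy three-token vocabulary, `top_logit_effects([1.0, -1.0], [[0.5, 0.0, -0.2], [0.1, 0.3, 0.0]], ["a", "b", "c"], k=1)` ranks "b" as the most suppressed token and "a" as the most promoted one.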
    Activation Density: 0.312%
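
The density figure is not defined on this page. A minimal sketch, assuming it is the fraction of sampled tokens on which the neuron's activation exceeds zero. The function name `activation_density` and the `threshold` parameter are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    activations: list of floats, one per sampled token (assumed input format).
    An empty sample yields 0.0 rather than raising a division error.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)
```

Under this assumption, a density of 0.312% would mean the neuron fires on roughly 3 of every 1,000 sampled tokens.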

    No Known Activations