INDEX
    Explanations

    These activations suggest that the neuron is looking for names of specific places, people, and entities

    proper nouns or names associated with people and places

    Negative Logits
    Token           Logit
    " defect"       -0.70
    "ufact"         -0.69
    "ucl"           -0.67
    "shire"         -0.64
    "etheus"        -0.57
    " imperative"   -0.56
    " narrated"     -0.56
    "ensical"       -0.55
    "arcity"        -0.54
    " cour"         -0.54

    Positive Logits
    Token           Logit
    "hiba"          0.80
    "wagen"         0.79
    "agi"           0.74
    "Mods"          0.73
    "scl"           0.70
    "hei"           0.69
    "kat"           0.68
    "hesda"         0.64
    "apons"         0.63
    "Container"     0.62

    Act Density 1.157%

    No Known Activations