    Explanations

    The neuron activates on occurrences of the word “Teen” (including parts of “Teenager”).

    Negative Logits
    Token        Logit
    “Map”        -0.07
    “ Rut”       -0.07
    “ Zinc”      -0.07
    “ Liu”       -0.07
    “ Hugh”      -0.07
    “map”        -0.07
    “ Bison”     -0.07
    “ Wit”       -0.07
    “ gold”      -0.06
    “ith”        -0.06
    Positive Logits
    Token             Logit
    “ teen”           0.11
    “ Teen”           0.10
    “ adolescent”     0.09
    “ teenage”        0.09
    “ teens”          0.08
    “Teen”            0.08
    “ teenagers”      0.08
    “teacher”         0.08
    “adden”           0.07
    “ adolescents”    0.07
    Activation Density: 0.007%

    No Known Activations