    Explanations

    TV show reviews

    The neuron detects named entities (especially capitalized proper names and TV‐show titles).

    Negative Logits
     xmin  -0.07
    	param  -0.07
     Carly  -0.07
     vtk  -0.06
    ัต  -0.06
     preschool  -0.06
     Vietnam  -0.06
    .broadcast  -0.06
     stud  -0.06
     pok  -0.06
    Positive Logits
     offending  0.07
    \"]  0.07
    ContentAlignment  0.06
    yms  0.06
     countered  0.06
     taxing  0.06
     prisoner  0.06
    ениями  0.06
    (no token text)  0.06
    }\"  0.06
    Act Density 0.022%

    No Known Activations