INDEX
    Explanations

    The neuron activates on the word “film” and closely related forms such as “films”, effectively detecting mentions of movies.

    Negative Logits
    Token            Logit
    " tráv"          -0.06
    " [~,"           -0.06
    "している"       -0.06
    " Imports"       -0.06
    " Meredith"      -0.06
    "-data"          -0.06
    "-categories"    -0.06
    "第四"           -0.06
    "ע"              -0.06
    " střed"         -0.06
    Positive Logits
    Token            Logit
    " LAS"           0.07
    " multiply"      0.07
    ".tx"            0.07
    " cro"           0.07
    " accumulate"    0.07
    " гар"           0.06
    " Ey"            0.06
    "ju"             0.06
    " sneak"         0.06
    " immortal"      0.06
    Act Density 0.038%
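
For readers unfamiliar with the density figure, the sketch below shows one common way such a number could be computed. It assumes that "Act Density" means the percentage of sampled tokens on which the neuron's activation is above zero; the page does not state its exact definition, and the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: density is the share of firing tokens among all sampled
    tokens, expressed as a percentage.
    """
    if not activations:
        return 0.0
    firing = sum(1 for a in activations if a > threshold)
    return 100.0 * firing / len(activations)


# Made-up sample: the neuron fires on 1 token out of 10,000.
sample = [0.0] * 9999 + [1.7]
print(f"{activation_density(sample):.3f}%")
```

Under this reading, a density of 0.038% would correspond to the neuron firing on roughly 38 of every 100,000 sampled tokens.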

    No Known Activations