INDEX
    Explanations

    This neuron mainly detects mentions of movie theaters or cinema-related terms (e.g., “cinema,” “theater,” and the names of movie-theater chains).

    Negative Logits
    -max
    -0.07
    904
    -0.07
     Dreams
    -0.06
     drink
    -0.06
     Samurai
    -0.06
     strongly
    -0.06
    ่ว
    -0.06
    isChecked
    -0.06
     Flame
    -0.06
     infield
    -0.06
    Positive Logits
     &↵
    0.07
    getattr
    0.07
    .axes
    0.06
    0.06
    าศ
    0.06
    ослав
    0.06
    アル
    0.06
     Ngb
    0.06
    ...",↵
    0.06
    ]?.
    0.06
    Act Density 0.020%

    No Known Activations