INDEX
    Explanations

    This neuron activates on the names of TV shows (proper-noun series titles) mentioned in the text.

    Negative Logits
    紹介
    -0.07
     Když
    -0.07
     prosince
    -0.07
     zoals
    -0.06
     drops
    -0.06
     října
    -0.06
    ная
    -0.06
     xbox
    -0.06
    _g
    -0.06
     toy
    -0.06
    Positive Logits
     Howell
    0.06
    					    
    0.06
     aaa
    0.06
     Costa
    0.06
    .MSG
    0.06
    Province
    0.06
     condos
    0.06
    CREATE
    0.06
     FILES
    0.06
    addresses
    0.06
    Activation Density: 0.003%

    No Known Activations