    Explanations

    This neuron activates on mentions of people going missing or disappearing (e.g., “missing,” “disappearance,” “went missing”).

    Negative Logits
    Token               Logit
    " jenom"            -0.06
    " mocker"           -0.06
    " tạp"              -0.06
    " '?"               -0.06
    " chiefs"           -0.06
    " ITS"              -0.06
    " маль"             -0.06
    " профессиональ"    -0.06
    "onces"             -0.06
    " Lamb"             -0.06
    Positive Logits
    Token               Logit
    " disappeared"      0.09
    "(steps"            0.07
    " Scalar"           0.07
    "phil"              0.07
    "locking"           0.07
    " vanished"         0.07
    "-bl"               0.06
    [blank in source]   0.06
    "แจ"                0.06
    [blank in source]   0.06
    Activation Density: 0.018%

    No Known Activations