INDEX
    Explanations

    The neuron activates on in-text citation markers and on figure/table reference markers (e.g. “ref-type=” attributes or numbered references).
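    To make the explanation concrete, here is a minimal sketch of the kind of markers it describes. The two regular expressions and the helper name `find_reference_markers` are hypothetical illustrations, not something this page provides. The `ref-type="..."` form is assumed to be the attribute found on JATS-style `<xref>` elements, and bracketed numbers are assumed as one common style of numbered reference.

```python
import re

# Hypothetical patterns: the page does not list the exact strings the neuron saw.
# JATS-style cross-reference attribute, e.g. <xref ref-type="bibr" rid="B12">
REF_TYPE_ATTR = re.compile(r'ref-type="([^"]+)"')
# Bracketed numeric citations such as [3], [4, 7] or [12-15]
NUMBERED_REF = re.compile(r'\[(\d+(?:\s*[,-]\s*\d+)*)\]')


def find_reference_markers(text: str) -> list[tuple[str, str]]:
    """Return (kind, matched_value) pairs in order of appearance."""
    hits = []
    for m in REF_TYPE_ATTR.finditer(text):
        hits.append((m.start(), "ref-type", m.group(1)))
    for m in NUMBERED_REF.finditer(text):
        hits.append((m.start(), "numbered", m.group(1)))
    # Sort by character offset so the two marker kinds are interleaved correctly.
    return [(kind, value) for _, kind, value in sorted(hits)]


sample = 'As shown in <xref ref-type="fig" rid="F2">Figure 2</xref> and earlier work [4, 7].'
print(find_reference_markers(sample))
# → [('ref-type', 'fig'), ('numbered', '4, 7')]
```

    Text without such markers yields an empty list, which is the case in which the explanation predicts no activation.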

    Negative Logits
    Token          Logit
    steps          -0.07
    ่ร             -0.07
    센터           -0.06
    .per           -0.06
    다면           -0.06
    (not shown)    -0.06
    untary         -0.06
    .cont          -0.06
    с              -0.06
     ",");↵        -0.06
    Positive Logits
    Token          Logit
     DataType      0.07
    rende          0.07
    (Frame         0.07
     vaccines      0.07
     chez          0.06
     induction     0.06
     khí           0.06
     flaw          0.06
    (runtime       0.06
    Col            0.06
    Act Density 0.001%

    No Known Activations