INDEX
    Explanations

    The neuron activates on in-text citation markers (e.g. numbered or bracketed reference/footnote indicators).

    Negative Logits
    medi
    -0.08
    gett
    -0.07
    ницт
    -0.07
    Mi
    -0.07
    roke
    -0.07
    psych
    -0.07
    maint
    -0.07
    mi
    -0.07
    نت
    -0.06
    -0.06
    Positive Logits
    "]==
    0.07
     ]]
    0.07
     Lol
    0.07
     باشند
    0.07
    !*
    0.07
     Tear
    0.06
    ule
    0.06
    		    		
    0.06
     ตำ
    0.06
     Torrent
    0.06
    Act Density 0.009%

    No Known Activations