INDEX
    Explanations

    This neuron activates on words that signal formal mathematical rigor or exactness (e.g. “exact,” “rigorous,” “certified”).

    Negative Logits
    ("\"           -0.07
    _black         -0.07
     mua           -0.07
    .Permission    -0.06
    munition       -0.06
    iece           -0.06
     summaries     -0.06
    	test          -0.06
    zero           -0.06
    sin            -0.06
    Positive Logits
    iếng           0.07
     Salvador      0.07
     บาง           0.06
    aptor          0.06
     Hull          0.06
    amm            0.06
    &,             0.06
     ισχ           0.06
    ΩΝ             0.06
     EIF           0.06
    Act Density 0.032%

    No Known Activations