INDEX
    Explanations

    The neuron activates on word‐initial “mon(o)-” fragments (e.g. monogenic, monocular, monolithic).
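
    Since no activations are recorded for this neuron, one way to collect candidate text for checking the explanation is a simple string filter. The sketch below is only a rough proxy for the described pattern and not the neuron's actual behavior: the regular expression and the name `find_mono_words` are assumptions made for illustration, and the pattern also matches unrelated words such as "Monday" or "money".

    ```python
    import re

    # Hypothetical proxy for the explanation: a word that starts with "mon",
    # optionally followed by "o", then at least one more letter.
    MONO_PREFIX = re.compile(r"\bmono?[a-z]+", re.IGNORECASE)


    def find_mono_words(text: str) -> list[str]:
        """Return the words in `text` that begin with "mon" or "mono"."""
        return MONO_PREFIX.findall(text)
    ```

    The `\b` anchor restricts matches to the start of a word, so "harmony" and "common" are skipped even though they contain "mon".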

    Negative Logits

    Token         Logit
    "eff"         -0.08
    "!I"          -0.07
    "ีเอ"          -0.07
    " Vital"      -0.07
    " Wit"        -0.07
    " энерг"      -0.07
    "fang"        -0.07
    " eff"        -0.07
    " Eff"        -0.07
    (blank)       -0.07

    Positive Logits

    Token           Logit
    " Mon"          0.14
    " mon"          0.13
    "Mon"           0.13
    "mon"           0.12
    " Monica"       0.10
    ".Mon"          0.09
    "_mon"          0.08
    "MON"           0.08
    "-mon"          0.08
    " monastery"    0.08

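
    Logit tables of this kind are commonly produced by projecting a neuron's output direction through the model's unembedding matrix and listing the tokens with the most extreme values. Whether this dashboard computes them that way is an assumption. The sketch below shows the general idea; `top_logit_tokens`, `w_out`, `w_unembed`, and `vocab` are hypothetical names, not taken from the source.

    ```python
    import numpy as np


    def top_logit_tokens(w_out, w_unembed, vocab, k=10):
        """Rank tokens by the logit change a neuron's output direction induces.

        w_out:     (d_model,) output direction of the neuron (assumed input)
        w_unembed: (d_model, vocab_size) unembedding matrix (assumed input)
        vocab:     list of token strings, one per vocabulary index
        """
        effects = w_out @ w_unembed           # one value per vocabulary token
        order = np.argsort(effects)           # ascending: most negative first
        negative = [(vocab[i], float(effects[i])) for i in order[:k]]
        positive = [(vocab[i], float(effects[i])) for i in order[::-1][:k]]
        return negative, positive
    ```

    Both lists are returned most-extreme-first, matching the ordering used in the tables.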
    Activation Density: 0.033%

    No Known Activations