INDEX
    Explanations

    Words ending in certain suffixes

    The neuron fires on Cyrillic‐script words, effectively detecting Russian‐language text.

    Negative Logits
    Token     Logit
    "i"       -0.09
    "ke"      -0.09
    "E"       -0.09
    "ane"     -0.09
    "CE"      -0.08
    "me"      -0.08
    "э"       -0.08
    "e"       -0.08
    "FE"      -0.08
    "I"       -0.08
    Positive Logits
    Token     Logit
    "on"      0.16
    "ON"      0.14
    "or"      0.12
    "son"     0.11
    "don"     0.11
    " don"    0.11
    "DON"     0.11
    "ton"     0.11
    "SON"     0.10
    "elon"    0.10
    Activation Density: 0.305%

    No Known Activations