    Explanations

    plus/minus signs

    This neuron detects simple parenthesized arithmetic subexpressions of the form “(<single‐letter variable> + <number>)”.
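
The pattern named in this explanation can be approximated with a regular expression. The following is a minimal sketch for illustration only: the names `PAREN_SUM` and `find_matches` are hypothetical, the regex is an assumption about what "single-letter variable" and "number" mean (one ASCII letter; an unsigned integer or decimal), and it says nothing about how the neuron itself computes its activation.

```python
import re

# Hypothetical pattern: "(", one ASCII letter, "+", an unsigned number, ")".
# Optional whitespace around each part is an assumption, not from the source.
PAREN_SUM = re.compile(r"\(\s*[A-Za-z]\s*\+\s*\d+(?:\.\d+)?\s*\)")


def find_matches(text: str) -> list[str]:
    """Return every substring of `text` shaped like "(x + 3)"."""
    return PAREN_SUM.findall(text)


if __name__ == "__main__":
    sample = "f(x + 3) and (y+10) but not (ab + 2) or (x + y)"
    print(find_matches(sample))
```

Multi-letter identifiers such as `(ab + 2)` and sums of two variables such as `(x + y)` are deliberately excluded, since the explanation mentions only a single-letter variable plus a number.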

    Negative Logits
    Token          Logit
    Password       -0.07
    credential     -0.07
    IDENT          -0.07
    "strconv       -0.06
    ident          -0.06
     Largest       -0.06
    (blank token)  -0.06
    offline        -0.06
    ermen          -0.06
    顔を           -0.06
    Positive Logits
    Token          Logit
     случаях       0.07
     نسبة          0.07
     meant         0.06
     yapıldı       0.06
    -bl            0.06
     ам            0.06
     ekip          0.06
     ang           0.06
    avě            0.06
     Cult          0.06
    Activation Density: 0.004%

    No Known Activations