INDEX
    Explanations

    The neuron detects the character sequence “the” (the letters t-h-e), whether it appears as the standalone article or embedded inside other words.

    Negative Logits
    Token              Logit
    " Cust"            -0.06
    [token not shown]  -0.06
    "ssize"            -0.06
    " urging"          -0.06
    " σκο"             -0.06
    "petto"            -0.06
    "$/,↵"             -0.06
    " occurrence"      -0.06
    " Law"             -0.06
    "formed"           -0.06
    Positive Logits
    Token              Logit
    " směrem"          0.07
    "limited"          0.07
    "-|"               0.07
    " returnType"      0.07
    " 없었다"           0.06
    "alesce"           0.06
    "lıyor"            0.06
    " inadvert"        0.06
    [token not shown]  0.06
    " guten"           0.06
    Activation Density: 0.057%

    No Known Activations