    Explanations

    research papers

    This neuron responds to words that name processes or actions, especially nominalizations such as “activities,” “allocation,” “identifying,” and “changes.”

    Negative Logits
    Token         Logit
    " tạp"        -0.06
    "ů"           -0.06
    "(Unknown"    -0.06
    "明白"        -0.06
    " uniqu"      -0.06
    "xml"         -0.06
    " interle"    -0.06
    " Pool"       -0.06
    "561"         -0.06
    " Fraud"      -0.06
    Positive Logits
    Token          Logit
    " Οι"          0.07
    " الجديد"      0.07
    " боль"        0.06
    "reira"        0.06
    " Acts"        0.06
    " patiently"   0.06
    "ope"          0.06
    "tri"          0.06
    "insics"       0.06
    " istedi"      0.06
    Activation Density: 0.122%

    No Known Activations