INDEX
    Explanations

    The neuron activates on mentions of writing or speaking English fluently (e.g. “English fluently”).

    Negative Logits
    stin
    -0.07
    stra
    -0.07
     Memories
    -0.06
    ا�
    -0.06
     ا
    -0.06
    argo
    -0.06
    .routing
    -0.06
     Sara
    -0.06
    Et
    -0.06
     diffé
    -0.06
    Positive Logits
    	control
    0.07
    .Generated
    0.06
    BBBB
    0.06
    ($__
    0.06
     zůst
    0.06
    (QStringLiteral
    0.06
    0.06
    Zh
    0.06
    ODE
    0.06
     THIRD
    0.06
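
The page does not say how the negative and positive logit lists above were produced. One common approach, sketched here as an assumption, is to project the neuron's output weight vector through the unembedding matrix and rank vocabulary tokens by the resulting score. The function name `top_logit_effects` and the toy weights and tokens are hypothetical and are not taken from this page.

```python
def top_logit_effects(w_out, unembed, vocab, k=3):
    """Rank tokens by the logit change a unit activation would cause.

    w_out   : neuron output weights, length d_model (hypothetical values)
    unembed : one row of length d_model per vocabulary token
    vocab   : token strings, aligned with the rows of unembed
    Returns (k most negative, k most positive) as (token, effect) pairs.
    """
    effects = [
        (tok, sum(w * u for w, u in zip(w_out, row)))
        for tok, row in zip(vocab, unembed)
    ]
    ranked = sorted(effects, key=lambda pair: pair[1])  # ascending by effect
    negative = ranked[:k]
    positive = ranked[::-1][:k]
    return negative, positive


# Toy example with a 2-dimensional model and a 4-token vocabulary.
w_out = [1.0, -1.0]
vocab = ["a", "b", "c", "d"]
unembed = [
    [0.07, 0.00],  # "a"
    [0.00, 0.07],  # "b"
    [0.03, 0.03],  # "c"
    [0.06, 0.00],  # "d"
]
negative, positive = top_logit_effects(w_out, unembed, vocab, k=1)
print(negative[0][0], positive[0][0])
```

In this sketch the sign convention matches the lists above: tokens with the lowest scores are the ones the neuron pushes down, and tokens with the highest scores are the ones it promotes.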
    Activation Density: 0.014%
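
Activation density is usually read as the fraction of token positions on which the neuron fires. The sketch below assumes that definition and a firing threshold of zero; neither is stated on this page, and the function name `activation_density` and the toy data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    The threshold of 0.0 is an assumption; other tools may use a
    different cutoff.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data: 100,000 token positions, of which 14 are active.
activations = [0.0] * 100_000
for i in range(14):
    activations[i * 7_000] = 1.5

density = activation_density(activations)
print(f"{density:.3%}")
```

Under these assumptions, a density of 0.014% corresponds to roughly 14 active positions per 100,000 tokens.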

    No Known Activations