INDEX
    Explanations

    The neuron activates on words and phrases indicating that code is working correctly, e.g. “works,” “fine,” “perfectly,” and “working.”

    Negative Logits
     необходимости    -0.06
     استاندارد    -0.06
     механіз    -0.06
     подум    -0.06
     دارم    -0.06
     вспом    -0.06
    孩子    -0.06
     CONSTANT    -0.06
    -0.06
    berra    -0.06
    Positive Logits
    -CN    0.07
     Claude    0.07
     Incontri    0.07
    _basic    0.07
    lında    0.06
     fill    0.06
     fills    0.06
     Range    0.06
    stdint    0.06
    .HttpSession    0.06
    Activation Density: 0.008%

    No Known Activations