    Explanations

    The neuron primarily activates on occurrences of the word “binary.”

    Negative Logits
     San              -0.07
    -su               -0.07
    hatt              -0.06
    _schedule         -0.06
     алког            -0.06
     Xt               -0.06
    -ste              -0.06
     Nou              -0.06
     pharmaceutical   -0.06
     Allah            -0.06

    Positive Logits
     Density          0.07
    BSD               0.07
                      0.06
    bsd               0.06
    (">               0.06
     inexperienced    0.06
    centers           0.06
                      0.06
     Percentage       0.06
     ans              0.06

    Act Density 0.007%

    No Known Activations