    Explanations

    non-English texts

    The neuron detects Japanese-language tokens (i.e. Kanji/Kana text).
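
The second explanation can be spot-checked against the tokens listed on this page. The following is a minimal sketch, not the method the auto-interpretation tool uses: the helper name `contains_kana_or_kanji` is hypothetical, and the check relies only on the Unicode block ranges for Hiragana, Katakana, and CJK Unified Ideographs. The ideograph block is shared with Chinese, so a match there does not by itself show that a token is Japanese.

```python
def contains_kana_or_kanji(token: str) -> bool:
    """Return True if any character lies in the Hiragana, Katakana,
    or CJK Unified Ideographs Unicode blocks (hypothetical helper)."""
    ranges = (
        (0x3040, 0x309F),  # Hiragana
        (0x30A0, 0x30FF),  # Katakana
        (0x4E00, 0x9FFF),  # CJK Unified Ideographs (kanji; also used for Chinese)
    )
    return any(lo <= ord(ch) <= hi for ch in token for lo, hi in ranges)


# Tokens copied from the logit lists on this page.
tokens = [",他", "ので", " Bundes", " церков", " dışında"]
flags = {tok: contains_kana_or_kanji(tok) for tok in tokens}
```

In this sample only `,他` and `ので` are flagged. The German, Russian, and Turkish tokens are not, which fits the broader "non-English texts" explanation better than the Japanese-specific one.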

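    Negative Logits

    Token                       Logit
    ",他"                       -0.06
    " Fuck"                     -0.06
    " Bulls"                    -0.06
    "razier"                    -0.06
    ".Application"              -0.06
    "--;↵"                      -0.06
    " Larry"                    -0.06
    " CC"                       -0.06
    (blank in source)           -0.06
    ".SelectedIndexChanged"     -0.06

    Positive Logits

    Token                       Logit
    " Bundes"                    0.07
    "898"                        0.07
    (blank in source)            0.07
    (blank in source)            0.07
    " dışında"                   0.06
    " Armed"                     0.06
    (blank in source)            0.06
    " церков"                    0.06
    "ので"                       0.06
    (blank in source)            0.06
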
    Activation Density: 0.166%
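
The page does not define how this density is computed. A minimal sketch follows, assuming it means the fraction of sampled token positions on which the neuron's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the activation values are all hypothetical and synthetic, chosen only so that the arithmetic reproduces a figure of the same size.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Fraction of positions whose activation exceeds `threshold`.

    Assumed definition; the source page does not state one.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic data: 166 active positions out of 100,000 sampled positions.
synthetic = [1.0] * 166 + [0.0] * (100_000 - 166)
density = activation_density(synthetic)
density_label = f"{density:.3%}"  # percentage string with three decimals
```

Under this assumed definition, a density of 0.166% corresponds to roughly one active position in every 600 sampled tokens.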

    No Known Activations