INDEX
    Explanations

    The neuron activates specifically on the word “text”.

    Negative Logits
     Ге             -0.07
                    -0.07
    (X              -0.06
     οποίο          -0.06
    uably           -0.06
     Іван           -0.06
    _zoom           -0.06
     Nguyễn         -0.06
    elper           -0.06
    ancybox         -0.06
    Positive Logits
    IMITER          0.06
     winning        0.06
    icontains       0.06
     distant        0.06
    /Subthreshold   0.06
     Hoff           0.06
    _TRIANGLES      0.06
     입니다            0.06
    !)↵↵            0.06
    Previous        0.06
    Act Density 0.007%

    No Known Activations