INDEX
    Explanations

    The neuron fires on words indicating material damage, specifically tears, punctures, or something being torn or ripped.

    Negative Logits
     설치  -0.06
     cycle  -0.06
     controlled  -0.06
     align  -0.06
    _META  -0.06
     Insert  -0.06
     doubled  -0.06
     vaccination  -0.06
     Lua  -0.06
     <<  -0.06
    Positive Logits
    ADO  0.07
     Pemb  0.07
     písem  0.07
    کری  0.07
     восп  0.06
     spolu  0.06
    εργ  0.06
    申请  0.06
    گي  0.06
     вок  0.06
    Activation Density: 0.024%

    No Known Activations