INDEX
    Explanations

    The neuron activates on phrases referring to website “terms of service” or permissions/legal restrictions.

    Negative Logits
    "адження"     -0.07
    "German"      -0.07
    "арі"         -0.07
    "WXYZ"        -0.06
    "LLLL"        -0.06
    " revise"     -0.06
    "250"         -0.06
    (blank)       -0.06
    " Simon"      -0.06
    " Knights"    -0.06
    Positive Logits
    "-init"       0.07
    "ляти"        0.07
    "ocks"        0.07
    "jal"         0.06
    " sizing"     0.06
    " SYS"        0.06
    " i"          0.06
    " postav"     0.06
    "('');↵"      0.06
    " Ül"         0.06
    Act Density 0.012%

    No Known Activations