    Explanations

    The neuron activates primarily on negative modal constructions (especially “can’t” and similar prohibitions), which indicate that the user is unable or forbidden to do something.

    Negative Logits
    Token                     Logit
    "Orange"                  -0.07
    " آباد"                   -0.07
    "uParam"                  -0.07
    " силу"                   -0.07
    "syscall"                 -0.07
    " Trouble"                -0.07
    "OutOfRangeException"     -0.07
    "チュ"                    -0.06
    " creator"                -0.06
    " آسی"                    -0.06
    Positive Logits
    Token                     Logit
    " Loans"                  0.07
    " conditional"            0.06
    "(first"                  0.06
    "359"                     0.06
    "кам"                     0.06
    " cock"                   0.06
    "rowave"                  0.06
    "็ต"                      0.06
    "enders"                  0.06
    "Α"                       0.06
    Activation Density: 0.094%

    No Known Activations