    Explanations

    Riots and clashes

    This neuron detects occurrences of the word “blasphemy” (and its sub‐token parts) in the text.

    References to communal violence or riots, particularly those involving Hindus and Muslims.

    Negative Logits
     الثاني
    -0.07
     Міністер
    -0.07
    itivity
    -0.06
    ++]=
    -0.06
    	status
    -0.06
     хочу
    -0.06
    -0.06
    _indicator
    -0.06
     :)↵↵
    -0.06
     dozens
    -0.06
    Positive Logits
     catastrophe
    0.07
    čná
    0.07
    фици
    0.06
    .Once
    0.06
     Cannabis
    0.06
    igrated
    0.06
     Nhật
    0.06
     Comey
    0.06
     projektu
    0.06
    .addAttribute
    0.06
    Activation Density: 0.033%

    No Known Activations