Explanations
Riots and clashes
This neuron detects occurrences of the word “blasphemy” (and its sub‐token parts) in the text.
References to communal violence or riots, particularly involving Hindus and Muslims.
Negative Logits
الثاني        -0.07
Міністер      -0.07
itivity       -0.06
++]=          -0.06
status        -0.06
хочу          -0.06
ッ            -0.06
_indicator    -0.06
:)↵↵          -0.06
dozens        -0.06
Positive Logits
catastrophe      0.07
čná              0.07
фици             0.06
.Once            0.06
Cannabis         0.06
igrated          0.06
Nhật             0.06
Comey            0.06
projektu         0.06
.addAttribute    0.06
Activations Density: 0.033%