INDEX
Explanations
The neuron fires on mentions of cults or extremist‐religious groups (e.g. “cult,” “religious group,” “brainwashing,” “extremist”).
New Auto-Interp
Negative Logits
軍
-0.06
Pun
-0.06
军
-0.06
,n
-0.06
.tabs
-0.06
/widget
-0.06
ستی
-0.06
xab
-0.06
ist
-0.06
pymongo
-0.06
POSITIVE LOGITS
delaying
0.07
miracles
0.07
/>}↵
0.06
AdapterFactory
0.06
ektiv
0.06
signify
0.06
')==
0.06
disagreements
0.06
turnovers
0.06
'")↵
0.06
Activations Density 0.008%