Explanations
references to extremist ideologies and their proponents
Negative Logits
Token         Logit
oredCriteria  -0.66
cshtml        -0.61
fjspx         -0.56
msgSender     -0.52
iastes        -0.52
referenties   -0.51
waard         -0.51
rawDesc       -0.51
writeField    -0.50
ExecuteAsync  -0.49
Positive Logits
Token      Logit
hate       1.02
Hate       0.86
extremist  0.86
hate       0.84
hateful    0.84
fascist    0.81
Hate       0.81
extrême    0.81
xen        0.81
extremism  0.80
Activations Density: 0.436%