INDEX
Explanations
phrases related to causing offense or being offended
expressions related to causing offense
Negative Logits
aver          -0.78
ynthesis      -0.77
ilk           -0.76
liner         -0.69
Requirements  -0.67
aws           -0.67
arijuana      -0.67
achine        -0.66
ynchron       -0.65
achev         -0.64
Positive Logits
offend        1.31
offended      1.20
offending     1.16
insulted      1.06
blasp         0.92
insult        0.87
indecent      0.82
olini         0.81
Sax           0.80
Û             0.77
Activations Density 0.012%