Explanations
words related to causing offense or being offended
expressions related to feelings of offense
Negative Logits
gravity        -0.77
issue          -0.73
rocket         -0.72
liner          -0.71
tom            -0.71
runner         -0.70
nosis          -0.66
ports          -0.66
pring          -0.66
hare           -0.65
Positive Logits
offend         0.98
offended       0.93
insulted       0.80
Yiannopoulos   0.79
indecent       0.78
bystanders     0.76
netflix        0.71
sensibilities  0.70
ĸļ             0.69
Cartoon        0.69
Activations Density: 0.026%