Explanations
negative descriptors related to moral judgment
extremely negative or offensive descriptors
Negative Logits
+#+#              -0.75
tagHelperRunner   -0.72
ValueStyle        -0.71
ImageContext      -0.70
leneck            -0.69
                  -0.69
DockStyle         -0.69
Personendaten     -0.69
parsedMessage     -0.67
IUrlHelper        -0.65
Positive Logits
vile              0.80
atrocious         0.79
disgusting        0.77
filthy            0.72
appalling         0.71
abominable        0.69
heinous           0.68
horrific          0.67
horrendous        0.66
despicable        0.66
Activations Density 0.017%