Explanations
descriptive words conveying a negative characteristic or feeling
instances of the word "nasty" and related negative descriptors
Negative Logits
HCR         -0.85
inet        -0.83
ingham      -0.82
inoa        -0.76
ETF         -0.76
aver        -0.75
produced    -0.74
inez        -0.73
oning       -0.72
Particip    -0.72
Positive Logits
nasty       1.15
surprises   1.03
adolesc     1.02
earthqu     0.87
spoil       0.85
ugly        0.83
undermin    0.79
poisonous   0.79
beasts      0.77
smelling    0.77
Activations Density 0.010%