Explanations
adjective-noun combinations that express negative judgement
negative descriptors related to societal issues and behaviors
Negative Logits
Token         Logit
ften          -0.73
ittee         -0.71
anwhile       -0.70
THERE         -0.68
smelled       -0.66
ilk           -0.64
ocket         -0.64
Originally    -0.63
uden          -0.63
orest         -0.63
Positive Logits
Token              Logit
situations         0.93
proportions        0.84
amounts            0.83
ideas              0.80
ones               0.80
levels             0.79
interpretations    0.79
circumstances      0.78
opportunities      0.77
truths             0.77
Activations Density: 0.502%