INDEX
Explanations
mentions of negatively viewed actions or states
words that are part of descriptive and emotive expressions, particularly adjectives and verbs that convey a strong action or feeling
descriptors associated with negative attributes or behaviors
Negative Logits
awar
-0.51
{{
-0.48
Ö
-0.45
cision
-0.43
Enlarge
-0.43
Reply
-0.43
":[
-0.41
Mandatory
-0.40
Newsletter
-0.40
natureconservancy
-0.40
Positive Logits
notwithstanding
0.58
ãĥ³ãĤ¸
0.54
éĹĺ
0.52
preferably
0.50
nonetheless
0.50
respectively
0.49
netflix
0.48
=]
0.46
ItemThumbnailImage
0.46
conservancy
0.45
Activations Density 5.995%