INDEX
Explanations
adjectives or adverbs describing a unique or distinctive quality
words that convey uniqueness or distinctiveness
Negative Logits
OTOS                -0.77
flaws               -0.68
adoption            -0.67
quickShipAvailable  -0.67
opposition          -0.66
Reviewer            -0.65
demolition          -0.65
conclusions         -0.64
substitution        -0.63
Onion               -0.62
Positive Logits
suited              0.82
responsible         0.78
narrated            0.78
differentiated      0.75
tailored            0.74
positioned          0.72
tuned               0.72
situated            0.71
trained             0.71
consulted           0.70
Activations Density 0.018%
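
For readers unfamiliar with the density figure, the sketch below shows one common way such a number could be computed: the share of token positions where the feature's activation is above a threshold, expressed as a percentage. This is an assumption about the metric, not a description of how this page calculates it; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of token positions where the feature is active.

    Assumption: "active" means the activation is strictly above `threshold`.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: 10,000 token positions, 2 of which activate the feature.
acts = [0.0] * 9998 + [1.3, 0.4]
density = activation_density(acts)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.018% would mean the feature fires on roughly 18 out of every 100,000 tokens sampled.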