INDEX
Explanations
positive sentiments and expressions of appreciation
Negative Logits
coerc            -0.75
toggle           -0.72
delinquent       -0.70
restraining      -0.68
prohibitions     -0.65
coercive         -0.65
vanish           -0.65
defaults         -0.64
vanishing        -0.63
Nope             -0.63
Positive Logits
thank            1.23
congratulate     1.19
congratulated    1.12
thanked          1.10
compliment       1.09
congr            1.04
Congratulations  1.03
congratulations  1.02
THANK            1.00
compliments      0.98
Activations Density 0.583%
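
A density figure like the one above is commonly read as the fraction of sampled tokens on which the feature activates. The page does not state its exact definition, so the sketch below assumes that reading; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and only illustrate the arithmetic.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 1000 tokens, of which 6 have a nonzero activation.
sample = [0.0] * 1000
for i in (10, 250, 400, 620, 777, 901):
    sample[i] = 1.5

density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.583% would mean the feature fires on roughly 6 of every 1000 tokens, which fits a feature tied to a narrow topic such as thanks and congratulations.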