INDEX
Explanations
reward descriptions in a loyalty program
phrases related to rewards and incentives
Negative Logits
[/           -0.85
.</          -0.81
</           -0.79
".[          -0.79
"},"         -0.77
®            -0.74
||||         -0.71
%"           -0.71
</           -0.70
Minecraft    -0.70
Positive Logits
likewise      1.14
similarly     1.00
again         0.96
Again         0.86
also          0.83
again         0.78
another       0.78
additionally  0.78
Same          0.77
equally       0.74
Activations Density: 0.788%