INDEX
Explanations
words related to expressing gratitude or understanding
terms related to apologies and appreciation
Negative Logits
ersion      -0.93
agus        -0.81
enegger     -0.78
UNE         -0.75
UGC         -0.75
Increase    -0.74
Ranked      -0.73
define      -0.71
Minotaur    -0.71
RTX         -0.71
Positive Logits
apolog      3.14
appreci     1.81
apolog      1.59
stal        1.46
desp        1.38
irre        1.25
coy         1.17
strateg     1.10
congrat     1.06
opportun    1.04
Activations Density 0.048%