Explanations
words related to making promises or guarantees
Negative Logits
asma               -0.70
Marginal           -0.70
reens              -0.67
rising             -0.66
externalToEVAOnly  -0.66
burn               -0.65
endar              -0.64
etsk               -0.63
totality           -0.63
aults              -0.62
Positive Logits
recommend  1.10
Promise    1.03
promise    0.95
suggest    0.94
'm         0.89
guarantee  0.87
advise     0.85
'll        0.84
encourage  0.76
presume    0.76
Activations Density 0.216%