INDEX
Explanations
phrases indicating effort or attempt
expressions of effort or commitment to doing one's best
Negative Logits
usted     -0.81
UST       -0.69
arters    -0.65
resy      -0.62
lvl       -0.61
Sands     -0.59
resent    -0.59
oston     -0.59
plur      -0.58
èĢħ       -0.58
Positive Logits
approximation  0.84
imitation      0.82
impression     0.78
job            0.76
endeav         0.74
effort         0.74
imperson       0.71
guess          0.70
rendition      0.69
to             0.67
Activations Density: 0.032%