INDEX
Explanations
verbs or phrases related to actions or commands
action verbs related to processes or operations that involve utilization or benefits
Negative Logits
sie         -0.64
Stage       -0.62
ricks       -0.61
anz         -0.60
pard        -0.60
stage       -0.60
rea         -0.59
Lauder      -0.58
Gentleman   -0.57
reluct      -0.57
Positive Logits
neither     0.71
é¾įå¥ij士    0.65
nothing     0.63
otta        0.60
both        0.60
ulic        0.59
auna        0.59
itates      0.59
ļéĨĴ        0.58
araoh       0.57
Activations Density 0.366%