INDEX
Explanations
phrases indicating comparison or assessment of quality
phrases indicating strength or intensity of actions or states
Negative Logits
":""},{"-0.72
livious
-0.62
moderator
-0.61
ctions
-0.61
Reviewer
-0.61
Coun
-0.60
aron
-0.60
UCT
-0.60
éĸ
-0.58
æ©
-0.57
Positive Logits
possibly
0.90
feas
0.78
Possible
0.71
dared
0.69
pleased
0.68
practicable
0.65
bler
0.63
reasonably
0.62
wished
0.62
tires
0.61
Activations Density 0.075%
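
As a rough illustration of how a density figure like the one above could be read, the sketch below assumes that activation density means the fraction of sampled token positions on which the feature's activation is above zero. That definition, the function name `activation_density`, and the sample data are all assumptions made for illustration; the page itself does not state how the figure is computed.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumed definition for illustration only; the source page does not
    specify how its density figure is derived.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 3 of 4000 sampled token positions.
acts = [0.0] * 3997 + [1.2, 0.4, 2.5]
density = activation_density(acts)
print(f"{density:.3%}")
```

Under this reading, a density of 0.075% would correspond to the feature firing on roughly 3 of every 4000 sampled tokens.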