INDEX
Explanations
phrases related to tasks or activities involving effort or physical labor
terms related to work and tasks
Negative Logits
respons      -0.79
uncond       -0.70
congr        -0.69
burnt        -0.65
bang         -0.65
Reviewer     -0.62
dime         -0.61
positively   -0.61
dstg         -0.59
cogn         -0.58
Positive Logits
theless      1.06
tenance      0.79
icity        0.73
ments        0.73
ment         0.70
iculty       0.70
recy         0.70
Hide         0.68
lihood       0.67
ionage       0.67
Activations Density 0.161%