INDEX
Explanations
sentences that mention the concept of work or task completion
Negative Logits
Token          Logit
ister          -0.69
clair          -0.68
akening        -0.67
arium          -0.67
omew           -0.66
epad           -0.65
ridge          -0.64
ename          -0.63
nonetheless    -0.62
wen            -0.62
Positive Logits
Token          Logit
bells          0.95
fuss           0.91
facets         0.88
goodies        0.85
hoop           0.83
things         0.80
ingredients    0.79
stuff          0.79
usual          0.78
components     0.77
Activations Density: 0.638%