INDEX
Explanations
sentences with a focus on self-improvement and feedback
Negative Logits
painfully    -0.76
unsus        -0.75
swiftly      -0.74
formally     -0.72
timely       -0.71
neatly       -0.71
thorough     -0.71
rapidly      -0.70
unab         -0.69
inform       -0.69
Positive Logits
Somebody     1.37
Obviously    1.36
Everybody    1.36
Especially   1.36
And          1.34
Whereas      1.34
Hopefully    1.31
Anyway       1.30
Because      1.29
Sometimes    1.28
Activations Density: 0.423%