INDEX
Explanations
phrases expressing ambition and dedication
Negative Logits
weg        -0.16
most       -0.15
/by        -0.15
quier      -0.15
/up        -0.15
atee       -0.14
/from      -0.14
xit        -0.14
ritten     -0.14
like       -0.14
Positive Logits
harder     0.26
towards    0.25
toward     0.24
hardest    0.23
-hard      0.22
hard       0.21
Towards    0.19
hard       0.19
Towards    0.18
ToFit      0.18
Activations Density 0.012%