Explanations
phrases related to self-improvement and personal growth
statements and themes related to self-improvement and accountability
Negative Logits
Token             Logit
Reconstruction    -0.61
Bak               -0.61
)]                -0.61
Annex             -0.59
ilts              -0.58
Latest            -0.56
thodox            -0.53
ansky             -0.53
Intel             -0.53
Roy               -0.53
Positive Logits
Token             Logit
yourself          1.61
yourselves        1.36
Yourself          1.12
your              0.87
YOUR              0.80
oneself           0.78
your              0.71
Your              0.69
poke              0.68
ichever           0.66
Activations Density 0.846%
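
The page does not state how the activation density is computed. One common reading, assumed here, is the fraction of sampled tokens on which the feature's activation is above zero. The sketch below illustrates that reading on made-up data; the function name `activation_density`, the `threshold` parameter, and the toy activations are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    fires at all. The dashboard itself does not confirm this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1000 tokens, 8 of which have a non-zero activation.
toy = [0.0] * 992 + [0.5, 1.2, 0.1, 2.0, 0.7, 0.3, 0.9, 1.1]

density = activation_density(toy)
print(f"{density:.3%}")  # prints 0.800%
```

Under this reading, a density of 0.846% would mean the feature fires on roughly 8 or 9 tokens out of every 1000 sampled.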