INDEX
Explanations
concepts related to influence and guidance
Verbs followed by pronouns
showing or causing outcomes
Negative Logits
Yourself       -0.82
thyself        -0.81
yourself       -0.77
yourself       -0.76
expandindo     -0.76
himſelf        -0.75
Myself         -0.75
myſelf         -0.73
ourselves      -0.72
yourselves     -0.72
Positive Logits
us               0.67
both             0.67
nicely           0.61
him              0.61
itself           0.60
the              0.59
me               0.57
many             0.56
CreateTagHelper  0.56
our              0.54
Activations Density: 0.717%