INDEX
Explanations
phrases indicating a desire or intention to do something
expressions of desire or intention
Negative Logits
stand          -0.67
bug            -0.65
ibliography    -0.64
trust          -0.64
aka            -0.61
®              -0.60
alias          -0.60
voc            -0.59
manship        -0.59
iop            -0.57
Positive Logits
clarification  0.79
revenge        0.77
to             0.72
something      0.71
answers        0.66
permission     0.66
clarity        0.64
desperately    0.61
attention      0.61
":[{"          0.60
Activations Density 0.093%