Explanations
Instances of phrasing related to actions or recommendations involving "you" and "to."
Negative Logits
Doing       -0.23
Doing       -0.21
DONE        -0.17
doing       -0.17
Done        -0.17
/do         -0.17
esModule    -0.16
rez         -0.16
VERRIDE     -0.15
done        -0.15
Positive Logits
di          0.23
-d          0.22
.d          0.22
d           0.21
dose        0.20
due         0.20
du          0.20
dot         0.20
dee         0.20
does        0.20
Activations Density: 0.089%