INDEX
Explanations
instances of transitional phrases or verbs indicating movement towards various outcomes
Negative Logits
Token       Logit
both        -0.18
both        -0.15
-END        -0.14
everything  -0.14
urovision   -0.13
\Url        -0.13
neas        -0.13
215         -0.13
stp         -0.13
umbnail     -0.13
Positive Logits
Token       Logit
gre         0.21
新的        0.20
becomes     0.19
new         0.19
become      0.18
became      0.18
being       0.18
becoming    0.17
embrace     0.17
Became      0.17
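
Byte-level BPE vocabularies display non-ASCII tokens as strings like `æĸ°çļĦ`, which is the raw form of the Chinese token 新的 in the positive-logit list. The sketch below shows how such a string could be mapped back to readable text. It assumes the model uses a GPT-2-style byte-to-character table, which the dashboard does not state; `bytes_to_unicode` follows the published GPT-2 scheme, and `decode_token` is a hypothetical helper name introduced here for illustration.

```python
def bytes_to_unicode():
    """Build the GPT-2-style map from byte values (0-255) to printable characters."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is assigned the next code point from 256 upward.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


# Inverse table: displayed character -> original byte value.
_CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(display: str) -> str:
    """Hypothetical helper: turn a byte-level token string into readable text.

    Assumption: the token uses the GPT-2 byte-level alphabet. A character
    outside that alphabet raises KeyError.
    """
    raw = bytes(_CHAR_TO_BYTE[ch] for ch in display)
    return raw.decode("utf-8", errors="replace")


print(decode_token("æĸ°çļĦ"))  # prints 新的
```

新的 is Chinese for "new", which is consistent with the English token `new` appearing in the same list. ASCII tokens such as `becomes` pass through this decoding unchanged.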
Activations Density: 0.143%