INDEX
Explanations
phrases indicating intent or purpose
occurrences of the word "to"
Negative Logits
typ       -0.69
folded    -0.67
hops      -0.60
busted    -0.60
need      -0.58
hop       -0.57
drafts    -0.55
rolled    -0.55
packs     -0.55
Got       -0.54
Positive Logits
ensure     1.18
ilet       1.17
ggles      1.10
avoid      1.08
satisfy    1.08
maximize   1.08
minimize   1.07
coincide   1.05
signify    1.05
achieve    1.04
Activations Density: 0.460%