INDEX
Explanations
verb phrases indicating starting or embarking on a journey or task
phrases indicating the initiation of actions or journeys
Negative Logits
eries        -0.64
aird         -0.62
ery          -0.60
Kings        -0.58
leness       -0.58
illard       -0.56
avorable     -0.55
Marketable   -0.55
itu          -0.55
oir          -0.55
Positive Logits
anew         0.90
fitted       0.88
to           0.83
towards      0.80
boldly       0.77
toward       0.77
posts        0.75
exploring    0.70
upon         0.69
tracks       0.69
Activations Density: 0.043%