INDEX
Explanations
text related to movement or traveling to a specific direction or location
phrases indicating direction or movement
Negative Logits
Token         Logit
illon         -0.74
ancies        -0.68
rehens        -0.65
orum          -0.65
eria          -0.64
DonaldTrump   -0.64
pse           -0.63
ylon          -0.61
TON           -0.61
iability      -0.61
Positive Logits
Token         Logit
toward        1.13
towards       1.09
butt          0.97
canon         0.95
liner         0.86
downwards     0.82
lines         0.82
quarter       0.81
into          0.79
Ahead         0.78
Activations Density: 0.045%