INDEX
Explanations
phrases related to measuring distance or progress
expressions related to progress or distance towards a goal
Negative Logits
oly         -0.66
pecially    -0.63
itu         -0.63
variable    -0.62
Guest       -0.61
ixture      -0.60
Var         -0.59
liction     -0.59
pots        -0.58
ores        -0.58
Positive Logits
tread       0.94
traveled    0.86
progressed  0.83
travelled   0.82
stairs      0.82
travers     0.78
slope       0.77
toward      0.77
traverse    0.76
towards     0.75
Activations Density 0.125%