Explanations
words related to procedures or instructions
phrases indicating a sequence or set of actions
Negative Logits
Token          Logit
Mostly         -0.68
gyn            -0.66
Corpus         -0.66
arium          -0.63
itute          -0.63
fest           -0.62
Confederacy    -0.61
zeb            -0.59
Unlimited      -0.59
orter          -0.59
Positive Logits
Token          Logit
steps          3.94
Steps          2.86
step           2.33
steps          2.22
strides        2.19
step           1.85
Step           1.74
Step           1.64
stairs         1.62
footsteps      1.61
Activations Density 0.013%