INDEX
Explanations
phrases or sentences describing initial actions or stages in a process
references to "first steps" or initial actions in various contexts
Negative Logits
ores          -0.75
iqueness      -0.69
Sins          -0.69
olls          -0.69
inately       -0.67
anguages      -0.66
contracted    -0.66
licted        -0.66
ittens        -0.66
poons         -0.65
Positive Logits
daughter      0.99
Steps         0.94
step          0.91
step          0.91
Step          0.91
steps         0.90
steps         0.86
toward        0.85
dad           0.84
forward       0.81
Activations Density: 0.018%