INDEX
Explanations
phrases or sentences indicating a consequential event or action
punctuation marks, specifically periods at the end of sentences
Negative Logits
thal        -0.79
itan        -0.73
pecially    -0.72
tones       -0.70
eatured     -0.70
culus       -0.70
cipled      -0.69
yssey       -0.67
pit         -0.67
ior         -0.67
Positive Logits
Eventually   1.18
Again        1.06
Then         1.06
Later        1.05
And          1.04
Soon         1.03
Ultimately   1.01
Slowly       1.00
But          1.00
Literally    0.99
Activations Density: 0.794%