Explanations
the repetition of the word "been" across various contexts
Negative Logits
soon        -0.61
still       -0.60
being       -0.55
being       -0.55
ftill       -0.54
soon        -0.54
tomorrow    -0.52
Being       -0.52
still       -0.51
not         -0.51
Positive Logits
depuis      0.73
since       0.73
through     0.69
fince       0.69
sejak       0.67
steadily    0.65
since       0.65
Through     0.61
SINCE       0.60
以來        0.60
Activations Density 0.195%