INDEX
Explanations
sentence end indicators followed by "but"
Negative Logits
methodologies   0.35
activations     0.32
deliverables    0.31
analogs         0.31
datasets        0.31
applications    0.30
alphan          0.30
Wrapper         0.29
analogues       0.29
benchmarks      0.29
Positive Logits
you             0.40
but             0.39
why             0.39
Wasn            0.39
didn            0.38
That            0.38
wouldn          0.38
yes             0.37
who             0.37
maybe           0.37
Activations Density: 0.103%