INDEX
Explanations
academic citations and references within scientific text
Negative Logits
otherwise     -0.15
ori           -0.14
you           -0.14
afterward     -0.14
Particularly  -0.13
afterwards    -0.13
ettel         -0.13
everything    -0.13
apr           -0.13
olarity       -0.13
Positive Logits
recent        0.34
recent        0.30
recently      0.28
Recent        0.27
Recent        0.26
Recently      0.26
Recently      0.26
Until         0.26
until         0.25
Until         0.25
Activations Density 0.131%