Explanations
phrases indicating a specific timeframe or context in the writing
Negative Logits
eph        -0.13
Overall    -0.13
ouv        -0.13
ansch      -0.12
ouis       -0.12
jah        -0.12
Overall    -0.12
ucch       -0.12
overall    -0.12
ışma       -0.12
Positive Logits
press          0.34
publication    0.26
writing        0.26
publishing     0.26
press          0.24
yet            0.23
_press         0.22
publication    0.22
PRESS          0.21
presses        0.21
Activations Density: 0.026%