INDEX
Explanations
timestamps or dates in a specific format
instances of the word "posted" and related terms indicating publication dates or editing actions
Negative Logits
abouts    -0.80
rium      -0.67
ovan      -0.66
knit      -0.64
oter      -0.62
acl       -0.61
rouse     -0.60
quin      -0.60
zos       -0.60
gars      -0.59
Positive Logits
Next          0.70
Published     0.62
Lear          0.61
icio          0.61
Manga         0.60
javascript    0.60
Posts         0.59
Jump          0.58
Tue           0.57
Posted        0.57
Activations Density: 0.055%