INDEX
Explanations
dates indicated in the month and day format
specific dates or mentions of the month of March
Negative Logits
cumbers      -0.74
unnecess     -0.68
¥ŀ           -0.66
Reviewer     -0.64
eleph        -0.63
gone         -0.62
gifted       -0.61
neoc         -0.61
looph        -0.61
selective    -0.61
Positive Logits
Madness      0.97
2019         0.96
2015         0.94
2018         0.93
2017         0.93
flower       0.86
2016         0.85
2021         0.85
2013         0.84
nard         0.84
Activations Density: 0.031%