INDEX
Explanations
dates in the format month year
occurrences of the word "of" and related phrases indicating temporal context
Negative Logits
Token       Logit
pread       -0.65
sympath     -0.60
inctions    -0.55
ado         -0.55
cro         -0.55
ioned       -0.54
irs         -0.54
inherit     -0.53
roar        -0.53
Reviewer    -0.53
Positive Logits
Token       Logit
2014        1.12
2015        1.10
2013        1.09
2017        1.06
2012        1.05
2016        1.05
2011        1.03
2009        1.01
1963        0.99
2018        0.94
Activations Density 0.038%