INDEX
Explanations
dates, specifically those in November
Negative Logits
king        -0.83
ldon        -0.81
gotten      -0.76
jriwal      -0.76
pires       -0.76
cher        -0.74
urdy        -0.71
ught        -0.71
utenberg    -0.71
ked         -0.71
Positive Logits
2012        0.85
2014        0.85
2015        0.82
2016        0.81
2013        0.80
е           0.80
1942        0.77
1941        0.77
2010        0.76
2011        0.76
Activations Density: 0.022%