INDEX
Explanations
dates written in the format "February [day]"
dates, particularly in February
Negative Logits
opic          -0.63
achu          -0.61
cumbers       -0.60
kef           -0.60
behavi        -0.59
ioch          -0.59
ographed      -0.56
elephant      -0.55
delinquent    -0.55
parap         -0.55
Positive Logits
nd            1.26
Madness       0.93
2019          0.92
2017          0.85
ruary         0.84
2016          0.84
2018          0.82
2015          0.81
ricks         0.81
mber          0.80
Activations Density 0.022%