INDEX
Explanations
mentions of specific days of the week, particularly "Thursday."
Negative Logits
Token        Logit
Sundays      -0.24
mon          -0.23
Mon          -0.22
Sat          -0.22
mon          -0.21
saturation   -0.21
Saturdays    -0.21
weekends     -0.20
Mon          -0.20
Mondays      -0.20
Positive Logits
Token        Logit
Thursday     0.88
Thursday     0.84
Thurs        0.68
Thu          0.60
Thu          0.53
ursday       0.49
Thur         0.38
38           0.38
48           0.35
98           0.33
Activations Density 0.037%
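
The density figure above can be read as the share of sampled tokens on which the feature fires. The sketch below illustrates that reading only. It assumes "density" means the fraction of tokens whose activation exceeds a threshold of zero; the function name `activation_density` and the sample counts are hypothetical and do not come from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density is "active tokens / total tokens". The dashboard
    does not state its exact definition.
    """
    if len(activations) == 0:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 37 active tokens out of 100,000.
sample = [1.0] * 37 + [0.0] * (100_000 - 37)
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.037%
```

Under this reading, a density of 0.037% corresponds to roughly 37 active tokens per 100,000, which fits a feature that responds to a narrow pattern such as a single weekday name.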