INDEX
Explanations
dates or days of the week, specifically focusing on Saturdays
references to "Saturday" and related contexts
Negative Logits
arent           -0.73
superflu        -0.71
polyg           -0.69
fixation        -0.68
correctional    -0.66
actu            -0.65
lessly          -0.64
diffusion       -0.63
vit             -0.63
ensing          -0.63
Positive Logits
Saturday        1.18
Night           1.06
Night           0.96
afternoon       0.94
Friday          0.91
Monday          0.91
nights          0.89
Sunday          0.89
morning         0.87
Showdown        0.86
Activations Density: 0.003%