Explanations
mentions of specific times of day and related temporal phrases
Negative Logits
daytime      -0.19
mornings     -0.19
Day          -0.17
Morning      -0.17
day          -0.17
morning      -0.17
nighttime    -0.17
i            -0.17
s            -0.16
Evening      -0.15
Positive Logits
/e           0.25
üstü         0.20
long         0.19
-long        0.18
ÙħÛĮÙĦادÛĮ   0.17
kü           0.16
steen        0.16
mares        0.16
-after       0.16
/y           0.16
Activations Density 0.028%