Explanations
phrases indicating temporal sequence
sequential actions and events in narratives
Negative Logits
Token       Logit
currently   -0.66
formerly    -0.61
prior       -0.61
affair      -0.60
gorilla     -0.59
nowadays    -0.58
lately      -0.57
Earlier     -0.56
unlike      -0.55
today       -0.55
Positive Logits
Token       Logit
EEK         0.76
maxwell     0.73
proceeded   0.71
bered       0.68
raq         0.67
leep        0.67
iphany      0.67
GGGG        0.66
AAAAAAAA    0.66
rosis       0.66
Activations Density: 0.506%