Explanations
phrases that indicate the timing of events, particularly the word "Today" and its variations
Negative Logits
rys              -0.17
isse             -0.15
embarrassment    -0.15
Osw              -0.15
earlier          -0.14
-to              -0.14
amus             -0.14
ote              -0.14
enders           -0.13
Walsh            -0.13
Positive Logits
adays            0.16
повіт            0.15
uom              0.15
tees             0.15
šak              0.15
ocket            0.14
ej               0.14
Spice            0.14
ripper           0.14
eč               0.14
Activations Density 0.061%