Explanations
time-related expressions indicating a short duration
phrases indicating a temporal context or duration
Negative Logits
anus       -0.69
911        -0.65
less       -0.65
theless    -0.64
2020       -0.63
Osw        -0.62
netflix    -0.61
smith      -0.60
ukong      -0.59
locking    -0.59

Positive Logits
ovych      0.80
FTWARE     0.67
bered      0.67
othes      0.66
alions     0.63
oké        0.62
forth      0.61
rou        0.61
oths       0.60
ambers     0.60

Activations Density: 0.014%