INDEX
Explanations
time-related phrases such as time durations and specific points in time
references to time duration and elapsed periods
Negative Logits
ople          -0.80
emale         -0.72
anooga        -0.70
MODE          -0.70
nels          -0.67
plex          -0.67
minecraft     -0.66
duction       -0.66
subsistence   -0.65
liction       -0.65
Positive Logits
Ago           1.28
ago           1.25
elapsed       0.85
shy           0.81
apiece        0.76
ccording      0.73
hindsight     0.72
Months        0.71
passed        0.70
long          0.69
Activations Density: 0.106%