Explanations
time-related words and durations
Negative Logits
emale       -0.82
Reviewer    -0.80
acus        -0.77
liction     -0.73
acted       -0.73
obook       -0.68
ocaly       -0.68
nels        -0.67
type        -0.64
duct        -0.64
Positive Logits
ago         1.45
Ago         1.41
elapsed     1.00
later       0.97
Later       0.89
Ahead       0.88
shy         0.87
ahead       0.85
transpired  0.77
Away        0.77
Activations Density: 0.067%