Explanations
phrases indicating duration or a sense of continuity over time
Negative Logits
pai          -0.17
tdown        -0.16
dge          -0.16
ikit         -0.15
cestor       -0.14
iversit      -0.14
oretical     -0.13
inson        -0.13
ffect        -0.13
soon         -0.13
Positive Logits
ago          0.41
since        0.34
since        0.27
ingly        0.27
ago          0.26
Since        0.24
ed           0.23
Since        0.23
-standing    0.23
Ago          0.22
Activations Density: 0.017%