INDEX
Explanations
references to events or actions that occurred a specific number of years "after" a past event
phrases indicating a passage of time or events occurring after a specific point
Negative Logits
uci         -0.76
ace         -0.72
atic        -0.71
ctive       -0.70
atically    -0.70
owed        -0.68
cin         -0.68
ac          -0.67
cue         -0.67
chio        -0.66
Positive Logits
completing      0.93
encountering    0.86
completion      0.83
arriving        0.81
rejecting       0.78
receipt         0.78
quitting        0.77
acquiring       0.77
launching       0.75
receiving       0.74
Activations Density: 0.042%