INDEX
Explanations
timestamps or time references within text
instances of temporal markers related to the act of writing
Negative Logits
igham        -0.73
heck         -0.68
bandwagon    -0.67
impe         -0.66
sacrific     -0.65
streng       -0.64
disob        -0.64
ogie         -0.63
benefit      -0.63
Cosponsors   -0.61
Positive Logits
cember       0.72
info         0.71
redacted     0.70
ctl          0.69
estimates    0.65
estimate     0.64
oth          0.62
rox          0.61
uary         0.61
Updated      0.61
Activations Density 0.066%
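
As a rough illustration of the density figure, the sketch below assumes that "activations density" means the fraction of tokens on which the feature has a non-zero activation; the page itself does not define the term. The function name, the threshold, and the sample data are all hypothetical and chosen only so the arithmetic reproduces 0.066%.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density = (tokens where the feature fires) / (all tokens).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 tokens, of which 66 activate the feature.
sample = [0.0] * 99_934 + [1.5] * 66
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.066%
```

Under that reading, a density of 0.066% corresponds to the feature firing on roughly 66 of every 100,000 tokens, which would fit a feature tied to narrow contexts such as dates and "Updated" markers.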