INDEX
Explanations
dates or times mentioned in the text
date references, specifically "Sep" followed by numerical values indicating days
Negative Logits
blows          -0.71
millions       -0.64
careless       -0.61
unprepared     -0.61
bearer         -0.60
unauthorized   -0.59
unf            -0.59
teeth          -0.58
liking         -0.57
Mother         -0.55
Positive Logits
arate          1.68
aration        1.58
arat           1.47
sis            1.22
arations       1.20
ar             1.04
hard           1.01
ul             0.99
hi             0.98
mented         0.97
Activations Density 0.030%