INDEX
Explanations
time periods or dates
instances of the word "to" and related phrases indicating time or a range
Negative Logits
Planet       -0.63
java         -0.62
earable      -0.61
xual         -0.61
riad         -0.60
framework    -0.59
ourke        -0.57
Entity       -0.57
utenberg     -0.56
erman        -0.56
Positive Logits
December     1.28
June         1.27
April        1.27
October      1.27
September    1.26
March        1.26
August       1.24
July         1.22
February     1.20
November     1.19
Activations Density 0.047%