INDEX
Explanations
dates related to specific events in news articles
references to specific dates, particularly in July and August
Negative Logits
Token      Logit
exha       -0.70
corn       -0.66
cumbers    -0.64
unnecess   -0.63
ancest     -0.62
calibr     -0.61
fect       -0.60
withd      -0.58
hyde       -0.58
regress    -0.58
Positive Logits
Token      Logit
31         1.24
29         1.08
28         1.03
26         1.02
iano       1.01
27         1.01
23         0.99
Fourth     0.98
22         0.97
24         0.96
Activations Density: 0.051%