INDEX
Explanations
temporal references related to events and dates
Negative Logits
February    -0.20
December    -0.20
January     -0.20
November    -0.19
October     -0.19
July        -0.19
September   -0.18
April       -0.17
June        -0.17
March       -0.16
Positive Logits
fe       0.44
j        0.43
may      0.40
jan      0.33
oct      0.32
august   0.31
march    0.30
dec      0.30
may      0.30
sept     0.30
Activations Density 0.249%