Explanations
dates in the format of month, day, and year
Negative Logits
Token      Logit
Genie      -0.60
aghetti    -0.55
tut        -0.54
habitual   -0.54
hobbies    -0.53
igans      -0.52
ioch       -0.52
lawy       -0.52
loves      -0.52
wand       -0.51
Positive Logits
Token      Logit
th         1.52
rd         1.14
ths        1.09
TH         0.95
teenth     0.90
eteenth    0.88
ember      0.86
tha        0.83
nd         0.82
2017       0.80
Activations Density 0.645%
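
The activation density above is presumably the fraction of sampled tokens on which this feature fires. The page does not say how the figure is computed, so the following is only a minimal sketch under that assumption. The function name `activation_density`, the zero threshold, and the sample values are all hypothetical and are not taken from the page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a non-zero activation.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations for one feature over 8 token positions.
sample = [0.0, 0.0, 3.2, 0.0, 0.0, 0.0, 0.0, 1.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.645% would mean the feature is active on roughly 6 or 7 of every 1,000 tokens, which fits a feature that fires only on date suffixes such as "th", "rd", and "nd".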