INDEX
Explanations
dates followed by blog post times
instances of the word "Posted" followed by a date or time reference
Negative Logits
atern    -0.81
ugal     -0.80
ively    -0.73
ifted    -0.72
fell     -0.72
isky     -0.71
ishi     -0.71
ppard    -0.71
roo      -0.70
stood    -0.70
Positive Logits
Posted       0.97
Thumbnails   0.83
Comments     0.73
monton       0.73
vertis       0.72
Prediction   0.67
Helpful      0.66
itors        0.66
siege        0.65
erick        0.65
Activations Density 0.008%
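
To put the density figure in concrete terms, here is a minimal sketch of the arithmetic. It assumes that "Activations Density" means the fraction of tokens on which the feature has a nonzero activation; the dashboard does not define the term, so treat that reading as an assumption. The function names are hypothetical and exist only for illustration.

```python
# Assumption: "Activations Density" is the fraction of tokens on which
# the feature fires (nonzero activation). The dashboard does not say so.
DENSITY_PERCENT = 0.008  # value shown on the dashboard


def density_fraction(percent: float = DENSITY_PERCENT) -> float:
    """Convert a percentage into a plain fraction."""
    return percent / 100.0


def expected_activations(num_tokens: int, percent: float = DENSITY_PERCENT) -> float:
    """Expected number of activating tokens in a sample of num_tokens."""
    return num_tokens * density_fraction(percent)


def tokens_per_activation(percent: float = DENSITY_PERCENT) -> float:
    """Average number of tokens between activations."""
    return 1.0 / density_fraction(percent)


if __name__ == "__main__":
    # Roughly 80 activating tokens per million under the stated assumption.
    print(expected_activations(1_000_000))
    # Roughly one activation every 12,500 tokens.
    print(tokens_per_activation())
```

Under that reading, the feature is very sparse, which fits explanations that tie it to a narrow pattern such as "Posted" followed by a date or time.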