INDEX
    Explanations

    time-related words, such as days, weeks, and months

    Negative Logits
    Token        Logit
    "inav"       -0.82
    "urities"    -0.77
    "versely"    -0.71
    "reme"       -0.68
    "ifice"      -0.67
    "jri"        -0.66
    "untled"     -0.65
    " Unc"       -0.63
    "icator"     -0.63
    "olini"      -0.62
    Positive Logits
    Token           Logit
    " ago"          1.17
    "'"             1.07
    " apiece"       0.98
    "hops"          0.91
    " gestation"    0.90
    "pring"         0.89
    " Ago"          0.88
    " consecut"     0.86
    "'/"            0.84
    "hift"          0.83
    Activation Density: 0.121%

    No Known Activations