INDEX
    Explanations

    phrases related to time and duration

    Negative Logits
    Token    Logit
    old      -0.16
    stag     -0.16
    nd       -0.16
    ogle     -0.15
    ing      -0.15
    /is      -0.15
    ional    -0.15
    orry     -0.14
    stile    -0.14
    ngr      -0.14
    Positive Logits
    Token    Logit
    -HT      0.28
    teenth   0.27
    teen     0.25
    де       0.24
    th       0.21
    -star    0.20
    Thirty   0.19
    bread    0.19
    tte      0.18
    ylim     0.18
    Activation Density: 0.178%

    No Known Activations