    Explanations

    phrases indicating time durations or periods of time

    Negative Logits
    Token            Logit
    " kasarigan"     -0.73
    " mobilen"       -0.54
    "زاد"            -0.52
    " Jefferson"     -0.49
    " Hinton"        -0.48
    " Dahl"          -0.48
    "ACD"            -0.48
    " Beaumont"      -0.47
    " verwij"        -0.47
    "mson"           -0.47
    Positive Logits
    Token                 Logit
    " over"               1.25
    " OVER"               1.09
    " Over"               1.04
    "Over"                0.95
    "mergeFrom"           0.95
    "over"                0.95
    " hinweg"             0.92
    "tover"               0.89
    " ModelExpression"    0.88
    " across"             0.88
    Act Density 0.128%

    No Known Activations