    Explanations

    phrases indicating temporal references or changes related to events

    Negative Logits
     Lastly           -0.86
    )))               -0.81
    "}],"             -0.78
    "]                -0.77
    "))               -0.77
    ))))              -0.77
    DES               -0.75
    Repeat            -0.73
    "!                -0.72
    trump             -0.72
    Positive Logits
     starters         0.72
     industrialized   0.69
    sofar             0.65
     sooner           0.62
     polls            0.60
    verning           0.60
     older            0.60
    oret              0.59
     fewer            0.58
     longtime         0.58
    Activation Density: 0.456%

    No Known Activations