INDEX
    Explanations

    phrases related to decision-making and actions

    phrases related to time-sensitive events and decisions

    Negative Logits
    Token           Logit
    "ãĤ©"           -0.66
    "PLA"           -0.62
    "verty"         -0.59
    "rain"          -0.56
    "ertain"        -0.56
    "yle"           -0.55
    " awa"          -0.54
    "mac"           -0.54
    "weeney"        -0.52
    "raq"           -0.52
    Positive Logits
    Token             Logit
    " altogether"     2.30
    " entirely"       1.71
    " outright"       1.30
    " completely"     1.13
    " lest"           1.10
    " because"        1.08
    " until"          1.06
    " whatsoever"     1.01
    " indefinitely"   0.99
    " anymore"        0.98
    Act Density 0.499%

    No Known Activations