INDEX
    Explanations

    phrases that express doubt or uncertainty

    phrases expressing potential difficulties or challenges

    Negative Logits
    Token          Logit
    "ŃĶ"           -0.73
    "ppo"          -0.70
    "çīĪ"          -0.64
    "ļéĨĴ"         -0.64
    "ulner"        -0.64
    "etheless"     -0.64
    "IRT"          -0.63
    "ĸļ"           -0.63
    "Published"    -0.63
    "ãĤ¸"          -0.61
    Positive Logits
    Token          Logit
    " but"         1.05
    " BUT"         0.84
    " though"      0.82
    " tho"         0.82
    " But"         0.81
    " anymore"     0.80
    "but"          0.78
    " however"     0.74
    "But"          0.73
    " yet"         0.73
    Activation Density: 0.728%

    No Known Activations