    Explanations

    phrases expressing negative sentiment or doubt

    Negative Logits
    Token     Logit
    " is"     -0.61
    "i"       -0.60
    "a"       -0.59
    "s"       -0.56
    "tels"    -0.56
    "k"       -0.56
    "y"       -0.54
    " ist"    -0.53
    "b"       -0.53
    "l"       -0.53
    Positive Logits
    Token                Logit
    " wouldn"            1.82
    "wouldn"             1.69
    " Wouldn"            1.51
    "Wouldn"             1.35
    " wouldnt"           1.32
    " shouldn"           1.16
    " Shouldn"           1.14
    " unlikely"          1.11
    " GenerationType"    1.07
    "CloseOperation"     1.03
    Activation Density: 0.060%

    No Known Activations