    Explanations

    phrases indicating uncertainty or indecision

    phrases expressing confusion or uncertainty about actions or decisions

    Negative Logits
        "urance"       -0.62
        "members"      -0.61
        "quad"         -0.60
        " Confeder"    -0.58
        " Vox"         -0.57
        "artisan"      -0.57
        "ĸļ"           -0.57
        " vouchers"    -0.57
        " Nurs"        -0.57
        "creator"      -0.56
    Positive Logits
        " eat"             1.01
        " choose"          0.95
        " give"            0.94
        " take"            0.94
        " characterize"    0.93
        " interpret"       0.92
        " classify"        0.92
        " blame"           0.92
        " inflict"         0.92
        " expect"          0.90
    Activation Density: 0.066%

    No Known Activations