INDEX
    Explanations

    words related to uncertainty or degrees of certainty

    Negative Logits
    Token             Logit
    "ires"            -0.76
    "ocracy"          -0.69
    " Pony"           -0.68
    "ETS"             -0.66
    "ocratic"         -0.63
    "ocrats"          -0.60
    "efer"            -0.60
    "estro"           -0.59
    " Wast"           -0.57
    "eval"            -0.57
    Positive Logits
    Token             Logit
    "chard"           1.61
    "acle"            1.49
    "acles"           1.48
    "Else"            1.46
    "chid"            1.41
    "nam"             1.40
    "ifice"           1.39
    "acular"          1.33
    " alternatively"  1.32
    " else"           1.21
    Activation Density: 1.052%

    No Known Activations