    Explanations

    uncertainty or lack of knowledge, especially in the form of questions

    expressions of uncertainty or lack of knowledge

    Negative Logits
    Token          Logit
    "ueller"       -0.71
    " exting"      -0.71
    "oided"        -0.66
    "ammers"       -0.65
    "wagen"        -0.64
    "stro"         -0.63
    "ccording"     -0.62
    "aez"          -0.62
    "onding"       -0.61
    "tein"         -0.61
    Positive Logits
    Token          Logit
    " why"         1.34
    " how"         1.32
    " whether"     1.27
    " what"        1.19
    " if"          1.16
    " anything"    1.06
    " anymore"     1.04
    " WHY"         1.04
    "why"          1.01
    " exactly"     1.01
    Activation Density: 0.041%

    No Known Activations