    Explanations

    phrases indicating confusion or uncertainty

    references to a lack of knowledge or uncertainty

    Negative Logits
    "iki"         -0.74
    "inka"        -0.71
    "Reviewed"    -0.68
    "ouri"        -0.67
    "inos"        -0.66
    "Pers"        -0.64
    "istant"      -0.63
    "ansk"        -0.63
    "conn"        -0.62
    "visor"       -0.60
    POSITIVE LOGITS
    " whatsoever"    0.98
    " how"           0.97
    " why"           0.95
    " whats"         0.82
    " what"          0.80
    " whence"        0.80
    "ledged"         0.79
    "why"            0.77
    " squat"         0.75
    " WHY"           0.74
    Act Density 0.036%

    No Known Activations