INDEX
    Explanations

    the word "what" in various contexts

    the phrase "do what" in various contexts

    Negative Logits
    Token           Logit
    "UTH"           -0.70
    " Returning"    -0.64
    "por"           -0.64
    "voy"           -0.63
    "diagn"         -0.61
    "uttering"      -0.59
    "lic"           -0.59
    " Returns"      -0.58
    "ipel"          -0.58
    "war"           -0.58
    Positive Logits
    Token           Logit
    "soever"        1.18
    " happens"      0.82
    " happened"     0.77
    " they"         0.72
    " mattered"     0.72
    "necessary"     0.71
    " else"         0.69
    " amounted"     0.69
    "andom"         0.66
    "idth"          0.65
    Act Density 0.054%

    No Known Activations