INDEX
    Explanations

    questions starting with "how" or "why"

    Negative Logits
    Token         Logit
    "thur"        -0.73
    "asant"       -0.71
    "cedented"    -0.71
    "wcsstore"    -0.69
    "agonists"    -0.68
    "heter"       -0.67
    "dayName"     -0.67
    " advoc"      -0.66
    "esville"     -0.66
    "itud"        -0.66
    Positive Logits
    Token             Logit
    "actic"           0.77
    "?]"              0.75
    " happen"         0.67
    " someone"        0.65
    " astronomers"    0.64
    "={"              0.64
    " emerge"         0.64
    " somebody"       0.63
    " evolve"         0.62
    " mathematic"     0.62
    Activation Density: 0.019%

    No Known Activations