INDEX
    Explanations

    questions asked with "aware of" or "best for"

    Negative Logits
    Token             Logit
    " we"             1.12
    " nobody"         1.08
    " simply"         1.07
    " nonetheless"    1.04
    " doesn"          1.03
    " there"          1.03
    " don"            1.00
    " since"          0.99
    " here"           0.98
    " she"            0.97

    Positive Logits
    Token             Logit
    " Of"             2.86
    " For"            2.80
    " And"            2.67
    " To"             2.56
    " With"           2.53
    " On"             2.52
    " At"             2.39
    " From"           2.36
    " By"             2.35
    "For"             2.29
    Activation Density: 1.490%

    No Known Activations