    Explanations

    asking why something is important

    Negative Logits
    :         1.00
    .:        0.95
    ”:        0.91
              0.91
    :”        0.86
    ?:        0.86
    :]        0.86
    .]:       0.85
    ’:        0.85
     she      0.84
    Positive Logits
     bother       1.68
     bothered     1.13
     bothering    1.03
     hassle       1.00
     bothers      0.97
     Anywhere     0.90
     nuisance     0.86
     Biological   0.86
     invoke       0.86
     burden       0.84
    Activation Density: 0.017%

    No Known Activations