    Explanations

    phrases indicating uncertainty or speculative statements

    Negative Logits
    Token         Logit
    "former"      -0.61
    "ouf"         -0.60
    "noon"        -0.57
    "quartered"   -0.56
    "76561"       -0.56
    " AAA"        -0.55
    "kie"         -0.55
    " Saying"     -0.54
    " Coffin"     -0.52
    " Trouble"    -0.52
    Positive Logits
    Token         Logit
    " beh"        1.50
    " seems"      1.30
    " becomes"    1.24
    " begs"       1.23
    " appears"    1.16
    " shouldn"    1.13
    " makes"      1.06
    " feels"      1.04
    "unes"        1.04
    " hurts"      1.03
    Act Density 0.150%

    No Known Activations