INDEX
    Explanations

    words and phrases that express certainty or emphasis

    preceding present-tense verbs

    adverb followed by verb

    Negative Logits
    Token            Logit
    serem            -0.63
    sighing          -0.52
    clarification    -0.52
    fouling          -0.50
    spotting         -0.50
    scrambling       -0.49
    informing        -0.49
    tracing          -0.47
    whispering       -0.47
    robbing          -0.46
    Positive Logits
    Token            Logit
    has              1.07
    got              1.04
    had              1.03
    is               1.00
    took             0.99
    can              0.97
    went             0.96
    seems            0.94
    makes            0.92
    came             0.92
    Activation Density: 0.312%

    No Known Activations