    Explanations

    direct references to the reader using the word "You"

    instances of the word "You."

    Negative Logits
    Token           Logit
    "wrapper"       -0.63
    "itud"          -0.62
    " theirs"       -0.60
    " airs"         -0.60
    "temp"          -0.58
    "shore"         -0.57
    " majority"     -0.56
    " srfAttach"    -0.55
    "ãĥ³ãĤ¸"        -0.55
    " stemming"     -0.55
    Positive Logits
    Token           Logit
    "'re"           1.15
    "'ve"           1.08
    "'ll"           1.02
    "Gov"           1.00
    " guessed"      0.99
    "ngth"          0.94
    " Tube"         0.94
    "imar"          0.91
    " guys"         0.91
    "ths"           0.90
    Activation Density: 0.108%

    No Known Activations
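
The Negative Logits entry ãĥ³ãĤ¸ reads like encoding debris, but it is consistent with the byte-level vocabulary form that GPT-2-style tokenizers use, where every raw byte is shown as a printable stand-in character. The sketch below rebuilds that byte-to-character table and reverses it to recover readable text. It assumes the feature comes from a model with a GPT-2-style byte-level BPE vocabulary (the page does not name the model), and `decode_vocab_token` is a hypothetical helper name used only for illustration.

```python
def bytes_to_unicode():
    """Byte -> printable-character table used by GPT-2-style byte-level BPE."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Remaining bytes (control characters, space, etc.) are shifted to 256+.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Reverse table: stand-in character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_vocab_token(token: str) -> str:
    """Hypothetical helper: turn a byte-level vocabulary string into text.

    Assumes every character of `token` is one of the 256 stand-in characters;
    a character outside that table raises KeyError.
    """
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_vocab_token("ãĥ³ãĤ¸"))
```

Under that assumption the entry decodes to the katakana pair ンジ. The same table explains why some vocabulary listings show a leading space as "Ġ": byte 32 is not printable on its own, so it is displayed as a stand-in character.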