    Explanations

    the use of the word "You" or phrases that directly address the reader

    Negative Logits
     airs         -0.63
    ipal          -0.62
    assembly      -0.61
     srfAttach    -0.58
    ice           -0.58
     Commerce     -0.57
    Lago          -0.57
    actic         -0.57
    éĹ            -0.56
     ammon        -0.56
    Positive Logits
    're           1.43
    've           1.26
    'll           1.23
     guys         1.10
    'd            1.05
    tub           1.00
    ngth          0.93
     gotta        0.92
    ldon          0.91
    ths           0.90
    Activation Density 0.118%

    No Known Activations