INDEX
    Explanations

    instances where the reader is directly addressed or instructed to take a particular action

    the word "you" in various contexts

    NEGATIVE LOGITS
    ipal          -0.82
    Lago          -0.69
    emon          -0.66
    weight        -0.64
    acular        -0.62
     Kemp         -0.62
    ortality      -0.61
    ģ«            -0.61
    efe           -0.60
     Parameters   -0.59
    POSITIVE LOGITS
    're           1.23
     guys         1.18
    tub           1.03
    'll           1.00
     mileage      0.95
    've           0.94
    RS            0.84
    'd            0.84
     yourselves   0.82
    tu            0.82
    Act Density 0.217%

    No Known Activations