INDEX
    Explanations

    instances of the word "you," indicating a focus on direct address or engagement with the reader

    NEGATIVE LOGITS
    was
    -0.22
     itself
    -0.17
    amp
    -0.15
    (s
    -0.15
     isnt
    -0.14
    ¤ëĭ¤
    -0.14
    Was
    -0.14
    だろう
    -0.14
     говорит
    -0.14
    이다
    -0.14
    POSITIVE LOGITS
    ’re
    0.61
    're
    0.55
    ’ve
    0.48
    've
    0.48
     are
    0.43
    ’ll
    0.37
    'll
    0.35
     yourself
    0.33
     aren
    0.33
     guys
    0.31
    Act Density 0.386%

    No Known Activations