INDEX
    Explanations

    References to the word "you" and its related forms, indicating a focus on direct address or interaction.

    Negative Logits
     betweenstory
    -0.86
    styleType
    -0.78
    IsMutable
    -0.76
    -0.72
     ddelweddau
    -0.71
    uxxxx
    -0.64
     يتيمه
    -0.64
    ConstraintMaker
    -0.64
    WriteBarrier
    -0.63
    ()?;
    -0.63
    Positive Logits
     you
    0.62
     YOU
    0.57
    you
    0.55
     yourselves
    0.54
    addContainerGap
    0.52
     Aérea
    0.52
    اء
    0.51
    crapers
    0.50
     flattery
    0.49
     yours
    0.48
    Act Density 0.202%

    No Known Activations