    Explanations

    personal pronouns, particularly variations of "I" and "we."

    Negative Logits
     surely
    -0.17
     Surely
    -0.17
    Sure
    -0.17
    ADED
    -0.17
    owitz
    -0.17
    capitalize
    -0.15
    巻
    -0.15
    aded
    -0.15
    sov
    -0.14
     organizers
    -0.14
    Positive Logits
     Shall
    0.19
     shall
    0.19
    shall
    0.18
     fancy
    0.17
     tong
    0.17
    елич
    0.16
     Vid
    0.15
    itan
    0.15
     flick
    0.15
     spoken
    0.15
    Act Density 0.109%

    No Known Activations