    Explanations

    indicators of audience engagement and self-identification

    Negative Logits
    Token           Logit
    "uez"           -0.17
    " structural"   -0.16
    " Structural"   -0.16
    "igsaw"         -0.15
    "ensch"         -0.15
    "addy"          -0.15
    "Ñī"            -0.15
    "ertz"          -0.15
    "rov"           -0.14
    "uala"          -0.14

    Positive Logits
    Token           Logit
    " fur"          0.18
    " jste"         0.16
    "etty"          0.16
    " yourself"     0.15
    "fur"           0.15
    ".echo"         0.14
    " hopefully"    0.14
    " yourselves"   0.14
    "piel"          0.14
    "ãģ¾ãģł"        0.14

    Activation Density: 0.124%

    No Known Activations