INDEX
    Explanations

    Phrases suggesting user engagement or conversational prompts

    Followed by first-person pronouns

    Negative Logits
     comment            -0.50
     writer             -0.50
    iredo               -0.50
     Comment            -0.49
    timewa              -0.49
     anon               -0.49
    .“                  -0.48
     Anon               -0.48
    “...                -0.47
    ="@                 -0.47
    Positive Logits
     виправивши         0.74
    ________________    0.72
    OGND                0.71
    Attached            0.68
     Normdatei          0.63
     Gonna              0.59
     Gotta              0.59
    Enviado             0.58
     defStyleAttr       0.58
    Добавлено           0.57
    Activation Density: 0.074%

    No Known Activations