INDEX
    Explanations

    phrases that indicate user engagement or interaction, particularly commenting

    Negative Logits
    Token               Logit
    "irc"               -0.15
    "reek"              -0.15
    "lene"              -0.15
    "aurus"             -0.15
    "ayer"              -0.15
    "iban"              -0.14
    "inkel"             -0.14
    "-League"           -0.14
    "ercise"            -0.14
    " elves"            -0.14

    Positive Logits
    Token               Logit
    "CriticalSection"    0.20
    "hart"               0.19
    "697"                0.17
    "hou"                0.17
    "inka"               0.17
    "Leave"              0.16
    " Leave"             0.16
    "enan"               0.16
    "ayette"             0.16
    "レット"             0.16
    Activation Density 0.020%

    No Known Activations