INDEX
    Explanations

    conversational markers and engagement cues

    Negative Logits
    ipl         -0.16
     manip      -0.16
    uct         -0.15
    ona         -0.14
    enate       -0.14
    este        -0.14
    ãĥĨãĥ«      -0.14
     sooner     -0.14
    mue         -0.14
    ote         -0.14
    Positive Logits
    udiantes    0.17
    便          0.16
    èħ          0.16
     Brill      0.15
    villa       0.14
    äd          0.14
    Wars        0.14
    ÑĨа         0.14
     Kelvin     0.14
     backpage   0.13
    Activation Density: 0.006%

    No Known Activations