INDEX
    Explanations

    phrases indicating user engagement with content, particularly "read more" prompts

    Negative Logits
    /tos      -0.14
    oro       -0.14
    igid      -0.14
    acea      -0.14
    orns      -0.14
    ches      -0.14
     packing  -0.14
    olumn     -0.14
    web       -0.13
    orn       -0.13
    Positive Logits
     Gamb     0.16
    fad       0.15
    ormsg     0.14
    ELLOW     0.14
    .sheet    0.14
    šli       0.14
    ÏįÏĢ      0.14
     Zuk      0.13
    ERV       0.13
    mpar      0.13
    Act Density 0.028%

    No Known Activations