INDEX
    Explanations

    email subscription-related text and prompts

    Negative Logits
    ested      -0.68
    footed     -0.65
    esan       -0.62
     Dwar      -0.61
    cule       -0.60
    ried       -0.59
    rans       -0.58
    utenant    -0.57
     Ames      -0.56
    dain       -0.56
    Positive Logits
    scribe          0.75
    Interstitial    0.74
    taboola         0.70
    CHAT            0.70
     unsub          0.70
     subscribe      0.67
    ulate           0.66
    ences           0.65
     uncond         0.65
    iatus           0.65
    Activation Density 3.700%

    No Known Activations