INDEX
    Explanations

    requests for clicking a box to verify non-robot status

    prompts or actions related to user interaction on a webpage

    Negative Logits
    "ccording"       -0.72
    "venge"          -0.61
    " brunt"         -0.56
    " mete"          -0.56
    " proport"       -0.54
    " helicop"       -0.54
    " relative"      -0.53
    " Constantin"    -0.52
    " Anon"          -0.51
    " comr"          -0.50
    Positive Logits
    "assis"          0.79
    " PsyNet"        0.70
    " Cancel"        0.67
    "Asset"          0.67
    "ricular"        0.64
    " buttons"       0.64
    "iframe"         0.63
    " Sign"          0.62
    " Download"      0.62
    "taboola"        0.61
    Activation Density: 0.005%

    No Known Activations