INDEX
    Explanations

    the presence of the word "help"

    the phrase "can't help" in various contexts

    Negative Logits
    theless      -0.66
    etheus       -0.64
    andom        -0.59
    avior        -0.58
    sonian       -0.58
    naire        -0.56
    punk         -0.54
    esome        -0.54
    initely      -0.54
    ortality     -0.54
    Positive Logits
    ctor          0.66
     noticing     0.64
    ãĤ®           0.61
    ":["          0.60
     Sega         0.59
    des           0.58
    ãĤ¨ãĥ«        0.58
     fielding     0.56
    #$#$          0.55
     sponsoring   0.55
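
Two of the positive-logit tokens (ãĤ® and ãĤ¨ãĥ«) look like raw vocabulary strings from a byte-level BPE tokenizer rather than readable text. The sketch below assumes the tokens come from a GPT-2 style byte-level vocabulary, which the page itself does not state. Under that assumption it rebuilds the standard byte-to-character table and inverts it to recover the underlying UTF-8 text. The helper name `decode_token` is hypothetical and introduced only for illustration.

```python
def bytes_to_unicode():
    """Byte -> printable character table used by GPT-2 style byte-level BPE."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to code point 256 + n, in byte order.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: vocabulary character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Hypothetical helper: map a vocabulary string back to readable text.

    Assumes `token` is a GPT-2 style byte-level BPE vocabulary entry.
    Incomplete UTF-8 sequences are replaced rather than raising.
    """
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĤ®", "ãĤ¨ãĥ«", "ctor"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

If the assumption holds, the two tokens decode to katakana (ギ and エル), and plain ASCII tokens such as `ctor` pass through unchanged. Tokens shown with a leading space on this page would appear with a leading `Ġ` in the raw vocabulary.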
    Activation Density: 0.033%

    No Known Activations