INDEX
    Explanations

    instances of the phrase "can't help" followed by verbs or pronouns indicating involuntary actions or emotions, such as "feeling", "smile", "notice", and "staring"

    instances of the word "help" and its variations

    Negative Logits
    Token       Logit
    "Sov"       -0.73
    "Safe"      -0.70
    "lore"      -0.70
    "Home"      -0.67
    "oven"      -0.66
    "••"        -0.64
    "LAN"       -0.64
    "LOS"       -0.62
    "ledged"    -0.61
    "yz"        -0.60
    Positive Logits
    Token           Logit
    " noticing"     1.07
    " wondering"    0.90
    " feeling"      0.89
    " grinning"     0.86
    " but"          0.85
    " smiling"      0.82
    " laughing"     0.81
    " imagining"    0.80
    " slipping"     0.78
    "but"           0.76
    Act Density 0.026%

    No Known Activations