INDEX
    Explanations

    greetings or salutations at the beginning of a text

    instances of the word "Hello" in various contexts

    Negative Logits
    "arian"      -0.84
    "aic"        -0.80
    "nutrition"  -0.79
    "rovers"     -0.75
    "prem"       -0.74
    "uing"       -0.73
    "ipl"        -0.73
    "ror"        -0.73
    "eele"       -0.72
    "acle"       -0.72
    Positive Logits
    " Kitty"     1.09
    " hello"     0.87
    " Hello"     0.82
    "!."         0.77
    " dear"      0.73
    "!,"         0.72
    " Goodbye"   0.70
    "!"          0.69
    "Hello"      0.69
    " guys"      0.69
    Activation Density: 0.008%

    No Known Activations