INDEX
    Explanations

    greetings or introductory phrases

    occurrences of the phrase "Hello."

    Negative Logits
    Token          Logit
    "aic"          -0.91
    "ucket"        -0.84
    "rent"         -0.83
    "eele"         -0.76
    "ifiable"      -0.75
    "hip"          -0.74
    "arian"        -0.71
    "arians"       -0.67
    "nutrition"    -0.66
    "abase"        -0.66
    Positive Logits
    Token          Logit
    " Kitty"       1.23
    " Neighbor"    0.96
    " Goodbye"     0.95
    " hello"       0.85
    " Bye"         0.82
    " bye"         0.82
    "!."           0.82
    " Hello"       0.80
    "!"            0.77
    " Again"       0.75
    Activation Density 0.021%

    No Known Activations