INDEX
    Explanations

    occurrences of the word "Hello"

    the occurrence of the phrase "Hello" in various contexts

    Negative Logits
    arian        -0.87
    aic          -0.81
    hip          -0.79
    ifiable      -0.79
    eele         -0.78
    pite         -0.78
    uing         -0.78
    arians       -0.77
    nutrition    -0.75
    rovers       -0.74
    Positive Logits
     Kitty       1.01
     hello       0.84
     Neighbor    0.83
    !.           0.78
     Hello       0.73
    !,           0.73
     Goodbye     0.71
     WORLD       0.70
    !".          0.70
    ール         0.69
    Act Density 0.015%

    No Known Activations