    Explanations

    instances of the greeting "Hello"

    Negative Logits
    Token           Logit
    " partisans"    -0.78
    "utton"         -0.77
    " reserv"       -0.71
    "udic"          -0.70
    " adjud"        -0.65
    " grounds"      -0.65
    "agency"        -0.65
    " Dug"          -0.65
    " exped"        -0.65
    " dispos"       -0.65
    Positive Logits
    Token           Logit
    "Hello"         3.58
    " Hello"        3.45
    " hello"        2.81
    "hello"         2.81
    "Hi"            1.96
    " Hi"           1.73
    " Goodbye"      1.71
    "reetings"      1.68
    " greeting"     1.45
    "Dear"          1.41
    Activation Density: 0.018%

    No Known Activations