INDEX
    Explanations

    greetings and welcoming phrases

    Negative Logits
    ħ       -1.74
    ¯       -1.67
    ŀ       -1.63
    Ļ       -1.58
    ĥ       -1.57
    ¾       -1.53
    ĸ       -1.52
    Ŀ       -1.51
     &=     -1.50
    ģ       -1.49
    Positive Logits
     chat         1.82
     congrat      1.80
     yours        1.78
     welcome      1.68
     joking       1.66
     Welcome      1.61
     my           1.59
     questions    1.57
     comments     1.56
     hello        1.55
    Activation Density: 1.609%

    No Known Activations