    Explanations

    the word "wanna" at various activation levels

    expressions of desire or intention to take action

    Negative Logits
    Token            Logit
    "士"             -0.86
    "VERTISEMENT"    -0.84
    "advertisement"  -0.83
    "arian"          -0.76
    "loo"            -0.74
    "lain"           -0.72
    "idem"           -0.71
    "sequ"           -0.70
    "edience"        -0.69
    "ochond"         -0.68
    Positive Logits
    Token            Logit
    " wanna"         1.29
    "ignt"           0.80
    " nab"           0.75
    "pping"          0.74
    "pload"          0.71
    " aspire"        0.70
    " listen"        0.70
    " hear"          0.69
    "reprene"        0.69
    " ya"            0.68
    Activation Density: 0.005%

    No Known Activations