INDEX
    Explanations

    phrases containing the word "want" with various levels of intensity

    expressions of desire and intention

    NEGATIVE LOGITS
    Token        Logit
    "NVIDIA"     -0.62
    "ilings"     -0.61
    "livious"    -0.58
    "ulty"       -0.58
    "illian"     -0.57
    " obser"     -0.57
    "ielding"    -0.57
    "icol"       -0.56
    "hesis"      -0.55
    "usting"     -0.55
    POSITIVE LOGITS
    Token           Logit
    " to"           1.10
    " revenge"      1.03
    " nothing"      0.90
    "only"          0.87
    " vengeance"    0.86
    " answers"      0.80
    " permission"   0.75
    "to"            0.75
    "reprene"       0.75
    " something"    0.73
    Activation Density: 0.111%

    No Known Activations