INDEX
    Explanations

    expressions of grumpiness and related negative emotions

    Negative Logits
    "oodles"     -0.15
    "enet"       -0.15
    "eden"       -0.14
    "طار"        -0.14
    "orderid"    -0.14
    "ernetes"    -0.14
    "edral"      -0.14
    " trot"      -0.14
    "anye"       -0.14
    " paran"     -0.14
    Positive Logits
    " gr"        0.38
    "ueling"     0.26
    "udging"     0.24
    "aces"       0.23
    "inning"     0.22
    "uff"        0.21
    "gr"         0.21
    "uel"        0.20
    " Gr"        0.19
    "istle"      0.19
    Activation Density: 0.013%

    No Known Activations