    Explanations

    emotional language and expressions related to praise or criticism

    Negative Logits
    " è£ıè"       -0.74
    "uyomi"       -0.69
    "ACP"         -0.67
    "thodox"      -0.66
    "ãĥ¼ãĥĨ"      -0.65
    "é¾įå¥ij士"    -0.65
    "Struct"      -0.62
    "Eastern"     -0.60
    "PF"          -0.59
    "gd"          -0.59
    Positive Logits
    " yourselves"  1.59
    " yourself"    1.20
    "Tube"         0.86
    " your"        0.84
    "your"         0.81
    " YOUR"        0.75
    " Yourself"    0.75
    " majesty"     0.73
    " cunt"        0.73
    " sir"         0.72
    Activation Density: 0.284%

    No Known Activations