    Explanations

    words expressing feelings of disdain, scorn, or contempt

    Negative Logits
    Token         Logit
    " Bram"       -0.15
    "커"          -0.15
    "ضی"          -0.14
    "746"         -0.14
    " Kern"       -0.14
    "_IGNORE"     -0.14
    " Penn"       -0.14
    "queeze"      -0.14
    "endance"     -0.13
    "子は"        -0.13
    Positive Logits
    Token         Logit
    "ible"        0.16
    "sure"        0.16
    "anka"        0.16
    "ky"          0.15
    "agini"       0.15
    "agas"        0.15
    "kle"         0.15
    "akash"       0.14
    "ünchen"      0.14
    "mann"        0.14
    Activation Density: 0.011%

    No Known Activations