INDEX
    Explanations

    profane and emphatic expressions

    expressions of frustration or strong emphasis

    Negative Logits
    IDS          -0.69
    OHN          -0.68
    DN           -0.64
    KEN          -0.63
     Flavoring   -0.62
     quo         -0.61
    ynthesis     -0.60
    KY           -0.57
    amide        -0.57
    ":""},{"     -0.57
    Positive Logits
    near      1.05
    ibly      1.01
    ation     0.86
    atio      0.84
    orse      0.77
     near     0.74
    ably      0.72
    ated      0.72
     damned   0.72
    emed      0.72
    Activation Density: 0.024%

    No Known Activations