INDEX
    Explanations

    words related to negative connotations or harmful attributes

    terms associated with negative outcomes or harmful effects

    Negative Logits
     Carbuncle             -0.76
    æĸ¹                    -0.75
     Polo                  -0.72
    BOOK                   -0.72
     Annotations           -0.72
    ALK                    -0.71
    externalActionCode     -0.70
    ãĥ¼ãĥĨãĤ£              -0.69
    uyomi                  -0.68
     Defenders             -0.67
    Positive Logits
    colm                   1.10
    adies                  1.10
    ignant                 1.06
    formed                 1.03
    arial                  0.96
    igned                  0.93
    practice               0.93
    ady                    0.88
    icious                 0.86
     mal                   0.85
    Activation Density: 0.015%

    No Known Activations