INDEX
    Explanations

    words related to negative or unpleasant situations

    the prefix "uns," indicating negation or absence

    Negative Logits
    Token          Logit
    "SHIP"         -0.75
    " Grayson"     -0.70
    "stanbul"      -0.68
    "zzo"          -0.68
    " Mercury"     -0.67
    "OPLE"         -0.67
    "*/("          -0.66
    "tsky"         -0.66
    " Guardians"   -0.66
    "anwhile"      -0.63
    Positive Logits
    Token          Logit
    " uns"         0.92
    "avour"        0.90
    "rep"          0.89
    "apon"         0.81
    "oci"          0.80
    "uitive"       0.80
    "conv"         0.79
    "heat"         0.79
    " concess"     0.78
    "alted"        0.76
    Activation Density: 0.006%

    No Known Activations