INDEX
    Explanations

    positive descriptive adjectives

    terms associated with niceness or positive attributes

    Negative Logits
    Token           Logit
    "WIND"          -0.96
    "AUT"           -0.82
    "ENG"           -0.74
    "Ultra"         -0.73
    " Printed"      -0.70
    "produced"      -0.69
    "âĵĺ"           -0.68
    "GC"            -0.67
    " Mandatory"    -0.66
    " Continued"    -0.66
    Positive Logits
    Token           Logit
    " nic"          1.46
    "eties"         1.38
    "uity"          0.95
    "uously"        0.94
    "atural"        0.94
    "esse"          0.93
    "otin"          0.91
    "eteenth"       0.89
    "ciating"       0.89
    "ety"           0.88
    Activation Density: 0.005%

    No Known Activations