INDEX
    Explanations

    words related to negative opinions or criticism

    negative descriptors, particularly related to the term "horrible."

    Negative Logits
     Southwest      -0.78
    RIC             -0.67
    camp            -0.66
     Qian           -0.65
     Northwest      -0.64
    ista            -0.64
     pillow         -0.62
     scholarship    -0.62
    north           -0.60
                    -0.60
    Positive Logits
    kefeller        0.87
    terday          0.81
    rible           0.78
    theless         0.77
    edom            0.75
    xon             0.75
    --+             0.73
    arten           0.72
    bley            0.72
     Gaal           0.70
    Act Density 0.034%

    No Known Activations