INDEX
    Explanations

    words related to negative or derogatory descriptions of people or actions

    terms related to deceitful or manipulative behavior

    Negative Logits
    " neoc"         -0.75
    " lapse"        -0.70
    " Ramadan"      -0.66
    " craving"      -0.66
    " famine"       -0.65
    "theless"       -0.64
    " millennium"   -0.64
    " foremost"     -0.63
    " abundantly"   -0.62
    " inaug"        -0.62
    Positive Logits
    "cheon"         0.95
    "hett"          0.81
    "udo"           0.79
    "berus"         0.77
    "itzer"         0.76
    "anut"          0.75
    "uli"           0.75
    "anon"          0.74
    "oslav"         0.71
    "arios"         0.70
    Act Density 0.079%

    No Known Activations