INDEX
    Explanations

    phrases related to strong and emphatic statements, often of disapproval

    instances of the word "outright" and its context related to strong assertions or definitive statements

    Negative Logits
    Token               Logit
    "agine"             -0.89
    "arts"              -0.80
    "ulton"             -0.77
    "ĺħ"                -0.76
    "nan"               -0.74
    "ĸļ"                -0.72
    "anwhile"           -0.72
    "gerald"            -0.70
    "utes"              -0.69
    " wisely"           -0.69

    Positive Logits
    Token               Logit
    " hostility"        0.84
    " racism"           0.77
    " contradiction"    0.72
    " refusal"          0.72
    " malice"           0.71
    " disregard"        0.70
    "itarian"           0.69
    " theft"            0.69
    " ban"              0.68
    "guiActiveUn"       0.67
    Activation Density: 0.026%

    No Known Activations