    Explanations

    phrases related to affirmations

    affirmations or confirmations within the text

    Negative Logits
    abal          -0.77
    èĢħ           -0.76
    arted         -0.73
    rance         -0.70
    ridor         -0.69
    ounded        -0.66
    vati          -0.65
    liner         -0.64
    Discussion    -0.64
    gall          -0.64
    Positive Logits
     sir           0.96
     technically   0.79
     THERE         0.77
     yes           0.74
     please        0.69
     anecd         0.67
     yeah          0.66
     sexism        0.65
     there         0.64
     insofar       0.63
    Activation Density: 0.050%

    No Known Activations