INDEX
    Explanations

    phrases related to giving advice or instructions

    phrases indicating suggestions or recommendations

    Negative Logits
     argues
    -0.84
     acknowledges
    -0.67
     contends
    -0.67
     advocates
    -0.65
     asserts
    -0.65
     cite
    -0.64
     ensures
    -0.63
    ths
    -0.63
     Polit
    -0.62
     cites
    -0.62
    Positive Logits
    apest
    0.77
    orage
    0.71
    —"
    0.70
    …"
    0.70
     …"
    0.69
    …."
    0.66
     mosqu
    0.64
     conflic
    0.63
     fitt
    0.62
     prank
    0.62
    Act Density 1.637%

    No Known Activations