INDEX
    Explanations

    phrases related to reactions or replies

    phrases indicating a reaction or response to events or questions

    Negative Logits
    " brakes"       -0.70
    "flo"           -0.70
    " Marin"        -0.68
    "\\\\\\\\"      -0.67
    "hemat"         -0.65
    " Dull"         -0.64
    " knots"        -0.64
    "oret"          -0.63
    "gin"           -0.63
    "utters"        -0.62

    Positive Logits
    " thereto"      0.84
    "reply"         0.83
    "ively"         0.82
    " briefs"       0.79
    " response"     0.77
    " responses"    0.76
    "uberty"        0.75
    "response"      0.71
    " feedback"     0.68
    "naires"        0.67

    Activation Density: 0.014%

    No Known Activations