INDEX
    Explanations

    responses to questions or statements

    instances of dialogue, specifically replies or responses in conversations

    Negative Logits
    Token                 Logit
    "teenth"              -0.75
    "mental"              -0.72
    "fi"                  -0.69
    "ctors"               -0.66
    "BALL"                -0.66
    "icipated"            -0.66
    "dar"                 -0.64
    "flame"               -0.64
    " Trials"             -0.63
    "cipled"              -0.63
    Positive Logits
    Token                 Logit
    " thereto"            0.98
    " angrily"            0.85
    " favorably"          0.85
    " sarcast"            0.84
    " affirm"             0.84
    " promptly"           0.80
    " politely"           0.79
    "reply"               0.78
    " enthusiastically"   0.77
    "later"               0.75
    Act Density 0.038%

    No Known Activations