INDEX
    Explanations

    answers or responses within a sentence

    assertions or statements that present information or answers

    Negative Logits
    "ership"        -0.76
    "rongh"         -0.68
    "ombat"         -0.68
    "idi"           -0.66
    " Cutting"      -0.66
    " Defenders"    -0.64
    "roying"        -0.63
    "ivities"       -0.62
    "Keefe"         -0.62
    " Samar"        -0.62
    Positive Logits
    " YES"           0.99
    " yes"           0.94
    "answer"         0.85
    " affirmative"   0.81
    "YES"            0.80
    "yes"            0.79
    " answer"        0.78
    "QUI"            0.77
    " answ"          0.74
    " answers"       0.73
    Activation Density: 0.141%

    No Known Activations