    Explanations

    keywords related to logical reasoning and persuasive language

    occurrences of the word "arguments."

    Negative Logits
    "orporated"   -0.67
    "onet"        -0.66
    "fecture"     -0.66
    " behold"     -0.63
    "covered"     -0.63
    "ifter"       -0.62
    " Merrill"    -0.62
    "ishable"     -0.62
    "cko"         -0.60
    "iph"         -0.59
    Positive Logits
    " arguments"   3.72
    " argument"    2.66
    "argument"     2.23
    " Argument"    2.10
    "Arg"          1.71
    " objections"  1.66
    " debates"     1.58
    " assertions"  1.54
    " Arg"         1.52
    " arguing"     1.49
    Activation Density: 0.015%

    No Known Activations