INDEX
    Explanations

    words related to providing evidence or support for a claim

    terms related to substantiation and refutation in arguments

    Negative Logits
     Mistress    -0.70
     Drawn       -0.69
    killer       -0.67
    eries        -0.67
    fork         -0.62
    izabeth      -0.62
    helm         -0.61
    >>>>>>>>     -0.61
    crow         -0.61
    nesday       -0.61
    Positive Logits
    acles        0.99
    ivity        0.96
    acle         0.93
    oret         0.92
    ctr          0.88
    iveness      0.87
    race         0.87
    iating       0.87
    raint        0.84
    iation       0.84
    Activation Density: 0.016%

    No Known Activations