    Explanations

    phrases related to making arguments or claims

    arguments and claims presented in a discussion

    Negative Logits
    "FORMATION"    -0.67
    "PER"          -0.66
    "Charges"      -0.62
    " Brooks"      -0.62
    "DER"          -0.59
    "TPS"          -0.59
    "KB"           -0.57
    " offenses"    -0.57
    "FORM"         -0.57
    " rows"        -0.57
    Positive Logits
    "uably"     1.34
    "uments"    1.08
    "emouth"    1.08
    "roup"      1.07
    "rave"      1.07
    "entin"     1.03
    "raph"      1.02
    "allery"    1.00
    "arin"      0.99
    "regate"    0.97
    Act Density 0.034%

    No Known Activations