INDEX
    Explanations

    words related to arguments or debates

    references to logical or rhetorical arguments

    Negative Logits
     Carbuncle    -0.69
    ISTER         -0.63
    idays         -0.62
     Atomic       -0.62
    lights        -0.62
     PHOTO        -0.61
    ookie         -0.60
     unmarked     -0.60
    aches         -0.60
    ardy          -0.59

    Positive Logits
    ative         1.30
    uments        1.14
     against      1.05
    abl           1.03
    ument         0.99
    ation         0.92
    ator          0.91
     arguments    0.88
    atives        0.88
    atively       0.85
    Activation Density 0.040%

    No Known Activations