    Explanations

    quotes or strong statements regarding controversial claims and denials

    Negative Logits
    Token            Logit
    "itud"           -0.15
    "lag"            -0.15
    "zee"            -0.14
    " cynical"       -0.14
    " escal"         -0.14
    "_extraction"    -0.13
    "air"            -0.13
    " Bair"          -0.13
    "aily"           -0.13
    "oose"           -0.13
    Positive Logits
    Token            Logit
    "icros"          0.19
    "reject"         0.18
    " reject"        0.17
    " waste"         0.16
    " Reject"        0.15
    ".reject"        0.15
    " rejects"       0.15
    "ibold"          0.15
    "éϤ"             0.15
    "<src"           0.15
    Act Density 0.251%

    No Known Activations