INDEX
    Explanations

    phrases indicating a negative stance or denial

    Negative Logits
        " towed"        -0.78
        " crowned"      -0.70
        "imir"          -0.66
        " sped"         -0.66
        "rex"           -0.65
        " tossed"       -0.65
        " flung"        -0.64
        " stabilized"   -0.63
        " knocked"      -0.63
        " bombed"       -0.62
    Positive Logits
        "xious"         1.05
        "except"        0.90
        "oses"          0.88
        "obs"           0.86
        " excuses"      0.85
        "ct"            0.85
        " doubt"        0.84
        " discern"      0.82
        " meaningful"   0.82
        "THING"         0.80
    Activation Density 0.050%

    No Known Activations