INDEX
    Explanations

    instances of contractions, specifically the contraction "won't"

    the expression of negation, particularly focusing on the word "won't."

    Negative Logits
    Token             Logit
    "gypt"            -0.61
    " agg"            -0.60
    " practicable"    -0.59
    " constrained"    -0.58
    " princ"          -0.58
    " anat"           -0.58
    " bonded"         -0.58
    " discontinued"   -0.57
    " entertained"    -0.56
    " populated"      -0.56
    Positive Logits
    Token             Logit
    "'t"              1.78
    "now"             1.14
    "cest"            1.10
    "itive"           1.07
    "kish"            1.01
    "kered"           0.96
    "ky"              0.93
    "ipeg"            0.93
    "ks"              0.91
    "etsk"            0.91
    Activation Density: 0.034%

    No Known Activations