INDEX
    Explanations

    phrases that express negation or dismissal, particularly using the term "never."

    Negative Logits
    Token             Logit
    "aises"           -0.14
    "ught"            -0.14
    "enticator"       -0.14
    "sj"              -0.13
    "arty"            -0.13
    "umph"            -0.13
    "óÅĤ"             -0.13
    "avatars"         -0.13
    "istical"         -0.13
    "ROP"             -0.13

    Positive Logits
    Token             Logit
    "mind"            0.30
    " mind"           0.27
    "winter"          0.24
    " underestimate"  0.23
    "ending"          0.23
    "land"            0.22
    " Mind"           0.21
    "Ending"          0.21
    " trust"          0.21
    "endum"           0.20

    Activation Density: 0.020%

    No Known Activations