INDEX
    Explanations

    words related to negation

    repetitive phrases punctuated with commas or specific qualifiers

    Negative Logits
    "ounded"          -0.67
    "gypt"            -0.64
    "ende"            -0.62
    "alone"           -0.60
    " Equip"          -0.59
    "oufl"            -0.58
    "mud"             -0.57
    " feasibility"    -0.56
    " chains"         -0.56
    " Klux"           -0.56
    Positive Logits
    " sir"            0.84
    " whatsoever"     0.80
    "onsense"         0.79
    " nor"            0.73
    " thank"          0.68
    " except"         0.65
    "æĪ"              0.61
    " Answer"         0.61
    "Shift"           0.60
    " exaggeration"   0.60
    Activation Density: 0.053%

    No Known Activations