INDEX
    Explanations

    contractions with the word "not"

    negations or contractions

    Negative Logits
    "CI"            -0.68
    "rant"          -0.64
    "senal"         -0.64
    " è£ıè"         -0.64
    "estate"        -0.63
    " referen"      -0.63
    "Person"        -0.62
    "ItemImage"     -0.62
    " successors"   -0.61
    "å½"            -0.61
    Positive Logits
    " afford"       1.40
    " imagine"      1.01
    " seem"         0.95
    " rely"         0.93
    " wait"         0.91
    " really"       0.90
    " ignore"       0.89
    " handle"       0.89
    " help"         0.87
    " bluff"        0.86
    Act Density 0.042%

    No Known Activations