INDEX
    Explanations

    negations and the word "not."

    Negative Logits
    Token               Logit
    (blank)             -0.73
    " للمعارف"          -0.68
    " autorytatywna"    -0.58
    " مرئيه"            -0.57
    "BagConstraints"    -0.56
    "PerformLayout"     -0.56
    " للاسماء"          -0.56
    "ScopeManager"      -0.55
    "postIndex"         -0.55
    " estekak"          -0.55

    Positive Logits
    Token               Logit
    " NOT"              1.63
    "NOT"               1.45
    " Not"              0.78
    "Not"               0.68
    "not"               0.64
    " NO"               0.62
    " NON"              0.57
    " НЕ"               0.56
    "ENOT"              0.54
    " NEVER"            0.50

    Act Density 0.009%

    No Known Activations