INDEX
    Explanations

    negations and their accompanying phrases

    NEGATIVE LOGITS
    Token                     Logit
    "avond"                   -0.64
    "+#+#"                    -0.63
    "حياتها"                  -0.57
    "iastes"                  -0.52
    "Demografia"              -0.51
    " resourceCulture"        -0.50
    (token not shown)         -0.50
    "DockStyle"               -0.49
    "حياته"                   -0.47
    "exitRule"                -0.47
    POSITIVE LOGITS
    Token                     Logit
    " mention"                1.07
    " Mention"                0.87
    " mentioning"             0.79
    "mention"                 0.76
    "ISupport"                0.72
    " mentions"               0.72
    "Mention"                 0.70
    " forget"                 0.69
    " forgetting"             0.65
    " виправивши"             0.61
    Activation Density: 0.161%

    No Known Activations