    Explanations

    negations and comparative phrases indicating contrast

    "not" followed by a negative concept

    Negative Logits
     NgModule           -0.53
     dourada            -0.48
    achau               -0.48
    L                   -0.45
    B                   -0.43
     Legendre           -0.43
    UrlResolution       -0.43
    A                   -0.42
     feminina           -0.41
    不等                -0.41

    Positive Logits
     necessarily        1.00
     merely             0.93
     مشين               0.92
     necesariamente     0.91
     just               0.90
    necessarily         0.89
    NameInMap           0.89
    жели                0.88
     дописавши          0.86
    それとも            0.85

    Activation Density: 0.215%

    No Known Activations