INDEX
    Explanations

    negation or disjunction

    NEGATIVE LOGITS
    "ponential"  -0.07
    ".OK"  -0.07
    " briefing"  -0.07
    "ijken"  -0.07
    "/Getty"  -0.07
    "erap"  -0.07
    "اورپوینت"  -0.06
    " mots"  -0.06
    "ίτ"  -0.06
    ",他们"  -0.06

    POSITIVE LOGITS
    " comfortably"  0.07
    " affair"  0.07
    " систем"  0.07
    "pies"  0.06
    " faded"  0.06
    "Sale"  0.06
    " صنعت"  0.06
    "	load"  0.06
    "dummy"  0.06
    " someday"  0.06
    Act Density 0.027%

    No Known Activations