INDEX
    Negative Logits
    Token         Logit
    " facile"     -0.07
    "playing"     -0.07
    " jacket"     -0.07
    "izioni"      -0.06
    " Sally"      -0.06
    "Minutes"     -0.06
    "ersions"     -0.06
    "\tC"         -0.06
    "이드"        -0.06
    "(Node"       -0.06
    Positive Logits
    Token         Logit
    " گفت"        0.07
    " Courtney"   0.06
    " till"       0.06
    " คำ"         0.06
    " ben"        0.06
    ".EMAIL"      0.06
    " deix"       0.06
    "990"         0.06
    " अपर"        0.06
    " امور"       0.06
    Activation Density: 0.002%

    No Known Activations