    Negative Logits
    Token           Logit
    "AAC"           -0.07
    "_chance"       -0.07
    "_publish"      -0.07
    "تماع"          -0.07
    "під"           -0.07
    "PHPUnit"       -0.07
    ".Head"         -0.06
    " StatusCode"   -0.06
    " cooper"       -0.06
    " σύ"           -0.06
    Positive Logits
    Token           Logit
    " introduced"   0.09
    " introduce"    0.09
    " introduces"   0.08
    " introducing"  0.07
    " Not"          0.06
    "\t        "    0.06
    " cri"          0.06
    " membership"   0.06
    " اس"           0.06
    " SO"           0.06
    Activation Density: 0.018%

    No Known Activations