INDEX
    Explanations
    Negative Logits
    Token         Logit
    " nett"       -0.06
    " alloy"      -0.06
    "ulation"     -0.06
    " owed"       -0.06
    "کیل"         -0.06
    "	label"     -0.06
    "Inter"       -0.06
    " owes"       -0.06
    "ुव"          -0.06
    " بگیر"       -0.06
    Positive Logits
    Token         Logit
    "YSQL"        0.07
    " sodom"      0.06
    "MapView"     0.06
    "AY"          0.06
    " PW"         0.06
    "WORK"        0.06
    "Charsets"    0.06
    " signaling"  0.06
    " SORT"       0.06
                  0.06
    Activation Density: 0.007%

    No Known Activations