INDEX
    Explanations
    Negative Logits
    " offended"   -0.07
    "روب"   -0.06
    " услуг"   -0.06
    "وث"   -0.06
    "окси"   -0.06
    " νεφώσεις"   -0.06
    "ین"   -0.06
    "وه"   -0.06
    "iam"   -0.06
    "аг"   -0.06
    Positive Logits
    "\thr"   0.07
    "Administration"   0.07
    "unders"   0.06
    "(shape"   0.06
    " dispenser"   0.06
    "\tlink"   0.06
    " chaining"   0.06
    "held"   0.06
    " physiology"   0.06
    "_uploaded"   0.06
    Activation Density: 0.001%

    No Known Activations