INDEX
    Explanations
    Negative Logits
    ophysical     -0.07
    attrs         -0.07
    trust         -0.07
     penalty      -0.07
     نوش          -0.06
    Coord         -0.06
                  -0.06
    drag          -0.06
    उन            -0.06
     Leone        -0.06
    Positive Logits
     выдел        0.07
     yiy          0.07
    ”,            0.06
    _url          0.06
     наблюд       0.06
                  0.06
    	End       0.06
     naï          0.06
     Перв         0.06
    _Cancel       0.06
    Activation Density: 0.047%

    No Known Activations