INDEX
    Explanations
    NEGATIVE LOGITS
    Token               Logit
    "\tbody"            -0.07
    " лок"              -0.07
    " favoured"         -0.06
    " CONF"             -0.06
    " شخصیت"            -0.06
    "avored"            -0.06
    " analyzed"         -0.06
    "aryawan"           -0.06
    "OLUMN"             -0.06
                        -0.06
    POSITIVE LOGITS
    Token               Logit
    "(option"           0.07
    "_RECEIVED"         0.06
    " Aub"              0.06
    "Rooms"             0.06
    "لط"                0.06
    " solic"            0.06
    "AssignableFrom"    0.06
    " ubytování"        0.06
    " Clara"            0.06
    " Scaling"          0.05
    Act Density: 0.005%

    No Known Activations