    Explanations

    permissions and rules

    Negative Logits
     대비  -0.10
    -0.09
    ुए  -0.08
    andt  -0.08
    ғаш  -0.08
     מיל  -0.08
     מנ  -0.08
     گیری  -0.08
    ುತ್ತಿದೆ  -0.08
    _avg  -0.08
    Positive Logits
     permiss  0.12
     permitido  0.12
     unrestricted  0.12
     permission  0.11
     allowed  0.11
     permitted  0.10
     Permit  0.10
     erlaubt  0.09
     limitée  0.09
     अनुमति  0.09
    Activation Density: 0.132%

    No Known Activations