    Negative Logits
     relig        -0.10
     Manila       -0.09
     Unido        -0.09
     religious    -0.08
     عشرة         -0.08
     مسجد         -0.08
     remise       -0.08
     Aviv         -0.08
     сервис       -0.08
     smoking      -0.08
    Positive Logits
     quadratic       0.12
     inequalities    0.11
     negativity      0.09
     polynomial      0.09
     negat           0.09
     PSD             0.09
     positivity      0.09
     infinity        0.08
     inequality      0.08
     quart           0.08
    Activation Density: 0.038%

    No Known Activations