    Explanations

    inequalities and integers

    Negative Logits
    Token            Logit
    (blank)          -0.08
    "比例"           -0.07
    " propor"        -0.07
    " 플레이"        -0.07
    (blank)          -0.07
    " proportions"   -0.07
    " 이렇게"        -0.07
    " μαζί"          -0.07
    "='$"            -0.07
    " avt"           -0.07
    Positive Logits
    Token            Logit
    "onavir"         0.08
    "-oper"          0.08
    "&i"             0.08
    " barriers"      0.08
    " Operational"   0.08
    "čnost"          0.08
    " operational"   0.08
    "čnega"          0.08
    " القسم"         0.08
    "цин"            0.07
    Activation Density: 0.026%

    No Known Activations