    Explanations

    negative numbers and subtraction

    Negative Logits
    Token                   Logit
    "ུ་"                    -1.00
    " *="                   -0.92
    " Multiply"             -0.82
    " multiply"             -0.78
    (token not rendered)    -0.76
    " hilsen"               -0.75
    "Multiply"              -0.74
    "wisata"                -0.73
    "multiply"              -0.73
    " چشم"                  -0.72
    Positive Logits
    Token                   Logit
    " negative"             1.65
    " minus"                1.45
    "negative"              1.38
    "Negative"              1.37
    "minus"                 1.23
    "Minus"                 1.17
    " Negative"             1.12
    " negativo"             1.05
    " subtraction"          1.05
    " negativos"            1.05
    Activation Density: 0.045%

    No Known Activations