INDEX
    Explanations

    mathematical expressions and comparisons

    Negative Logits
    "phazard"      -0.45
    " tambahan"    -0.43
    "<?>>"         -0.42
    "+#+#"         -0.42
    " Pim"         -0.40
    "zonder"       -0.40
    "nhs"          -0.40
    " Zon"         -0.38
    " Ruh"         -0.36
    "cession"      -0.36
    Positive Logits
    " None"        1.34
    " none"        1.30
    "None"         1.23
    "none"         1.11
    " ninguno"     1.03
    " NONE"        1.03
    " neither"     0.96
    "neither"      0.90
    "Neither"      0.89
    " Neither"     0.87
    Activation Density: 0.423%

    No Known Activations