INDEX
    Explanations

    abbreviations or acronyms that contain a numerical value

    Negative Logits
    闘
    -0.79
     サーティワン
    -0.72
    ギ
    -0.69
    ワン
    -0.68
    ーティ
    -0.67
    OLOGY
    -0.67
    ーク
    -0.66
    hower
    -0.61
    龍契士
    -0.59
     Primal
    -0.59
    Positive Logits
    adders
    1.26
    ugs
    1.14
    ibr
    1.11
    ibrarian
    1.09
    idd
    1.09
    ipp
    1.09
    ashes
    1.07
    ips
    1.06
    agging
    1.05
    ongh
    1.03
    Act Density 8.536%

    No Known Activations