INDEX
    Explanations

    percentages or numerical values

    Negative Logits
    ndra          -0.68
    anski         -0.66
    bed           -0.61
    fram          -0.59
     dagger       -0.59
     sed          -0.59
    oak           -0.59
     Hamm         -0.59
     bait         -0.58
    yang          -0.57
    Positive Logits
    9999          1.19
    999           1.16
    percent       1.01
    99            0.94
     percentile   0.92
    ゼウス        0.90
    998           0.86
    98            0.85
    %%%%          0.84
    95            0.83
    Act Density 0.074%

    No Known Activations