INDEX
    Explanations

    specific numerical values and related qualifiers in text

    Negative Logits
    Token        Logit
    "yms"        -0.16
    "åħ"         -0.16
    "forman"     -0.15
    "[port"      -0.15
    "agged"      -0.14
    " disarm"    -0.14
    "uben"       -0.14
    "İY"         -0.14
    "pty"        -0.14
    "ãĥ«ãĤ¯"     -0.14

    Positive Logits
    Token        Logit
    " dick"      0.18
    "çek"        0.17
    "ваÑĢ"       0.15
    " lip"       0.15
    " Lip"       0.15
    "blade"      0.15
    "belt"       0.15
    "bur"        0.15
    " Brick"     0.15
    " strip"     0.14
    Activation Density: 0.025%

    No Known Activations