INDEX
    Explanations

    numbers written as words

    patterns related to numerical values

    Negative Logits
    Token          Logit
    "loo"          -0.80
    "REAM"         -0.73
    "WARD"         -0.70
    "ffee"         -0.70
    "hips"         -0.70
    "Accessory"    -0.66
    "hire"         -0.64
    " Denis"       -0.63
    "OHN"          -0.63
    "RAFT"         -0.63
    Positive Logits
    Token          Logit
    "eral"         0.98
    "emonic"       0.94
    " num"         0.91
    "pty"          0.88
    "quist"        0.87
    " Num"         0.87
    "phys"         0.84
    "BER"          0.77
    "num"          0.75
    "iatures"      0.74
    Act Density 0.023%

    No Known Activations