INDEX
    Explanations

    numeric patterns and numerical values such as specific numbers or numerical expressions

    Negative Logits
    Token           Logit
    "loo"           -0.95
    "WARD"          -0.76
    "EEE"           -0.73
    "REDACTED"      -0.72
    "hire"          -0.71
    "mosp"          -0.71
    "ELF"           -0.71
    " Template"     -0.70
    "hold"          -0.70
    " Directive"    -0.69
    Positive Logits
    Token           Logit
    "eral"          1.02
    "pty"           0.96
    "BER"           0.94
    "emonic"        0.92
    "ptoms"         0.91
    "itionally"     0.90
    "phys"          0.90
    "locked"        0.89
    "ming"          0.88
    " num"          0.84
    Act Density 1.153%

    No Known Activations