    Explanations

    references to bugs in software or systems

    Negative Logits
    "ACP"          -0.87
    "yss"          -0.83
    "uclear"       -0.75
    "ometown"      -0.71
    "NAS"          -0.71
    "amina"        -0.70
    "ining"        -0.69
    "ager"         -0.67
    "minist"       -0.66
    "ographic"     -0.66

    Positive Logits
    " Bunny"        1.03
    "hooting"       0.95
    " bugs"         0.94
    " patched"      0.92
    "bugs"          0.91
    "pots"          0.89
    " Bugs"         0.86
    "afety"         0.84
    " glitches"     0.82
    "lash"          0.79
    Act Density 0.013%

    No Known Activations