INDEX
    Explanations

    Highlights instances of the word 'delete' along with related actions

    Negative Logits
    Token             Logit
    "annis"           -0.99
    "orsi"            -0.77
    "acs"             -0.76
    " negotiators"    -0.73
    "verning"         -0.71
    "soType"          -0.71
    "Building"        -0.70
    "enegger"         -0.70
    "POL"             -0.69
    "gio"             -0.69

    Positive Logits
    Token             Logit
    " Delete"         0.99
    " delete"         0.82
    " delet"          0.81
    " deleted"        0.80
    "itor"            0.73
    "leted"           0.72
    "abytes"          0.69
    " scrolls"        0.68
    "aneous"          0.67
    " unnecessary"    0.66

    Act Density 5.079%

    No Known Activations