INDEX
    Explanations

    Instances of the word "remove" and its variations, indicating a focus on deletion or extraction.

    Negative Logits
     bArr       -0.78
    AsUp        -0.75
    Portale     -0.72
     Stuart     -0.71
     Schuster   -0.70
    pity        -0.69
    };*/        -0.66
     fact       -0.65
    ?}",        -0.63
    (blank)     -0.63
    Positive Logits
     Removal    1.61
     REMOVE     1.60
     Remove     1.60
     removal    1.59
    Remove      1.54
     removals   1.53
     REMOV      1.53
     remove     1.49
     removed    1.47
     Removes    1.47
    Act Density 0.079%

    No Known Activations