INDEX
    Explanations

    phrases related to intentionally disregarding something

    instances of the word "ignore" and its variations

    Negative Logits
    Token               Logit
    "unal"              -0.68
    "uliffe"            -0.68
    "ickr"              -0.67
    "ramer"             -0.66
    "ccording"          -0.65
    " millenn"          -0.64
    "Stars"             -0.63
    "amide"             -0.62
    "ikuman"            -0.62
    "creation"          -0.62
    Positive Logits
    Token               Logit
    "ibly"              0.90
    "ibility"           0.72
    "lessly"            0.67
    " underestimate"    0.64
    " Sakuya"           0.64
    "illy"              0.63
    " aside"            0.63
    " prejudice"        0.61
    "fulness"           0.61
    "erella"            0.60
    Activation Density: 0.028%

    No Known Activations