INDEX
    Explanations

    phrases related to avoiding something

    instances of the word "avoid" and related forms

    Negative Logits
    iop         -0.81
    geist       -0.70
    essee       -0.69
    rooms       -0.69
    cart        -0.67
    song        -0.65
    Rated       -0.64
    otle        -0.64
    RAW         -0.63
    dy          -0.63
    Positive Logits
     detection  0.79
     pitfalls   0.71
    vana        0.71
    nels        0.71
    ably        0.69
    ading       0.69
     wasting    0.68
    ption       0.68
    hess        0.67
     answering  0.67
    Act Density 0.035%

    No Known Activations