INDEX
    Explanations

    instances of contrast or contrasting ideas

    the word "while" and its variations to indicate contrasting scenarios or conditions

    Negative Logits
    ilet    -0.73
    Lay     -0.66
    orthy   -0.65
    bard    -0.64
    atari   -0.64
    INS     -0.63
    aeda    -0.63
    inated  -0.63
    idy     -0.62
    ANN     -0.62
    Positive Logits
     acknowledging  0.94
     technically    0.80
     respecting     0.80
     conced         0.77
     researching    0.74
     admitting      0.74
     imperfect      0.72
    terness         0.67
     shading        0.67
    lihood          0.66
    Act Density 0.070%

    No Known Activations