INDEX
    Explanations

    the word "but" in various contexts

    Negative Logits
     Berger     -0.66
     antic      -0.63
     Mandela    -0.63
     Brett      -0.63
     Holocaust  -0.62
     Patri      -0.62
     lies       -0.62
     Rover      -0.61
     ???        -0.59
     Roose      -0.58
    Positive Logits
    sts          0.99
    term         0.95
    chery        0.89
    tes          0.89
    chel         0.88
    Reviewer     0.88
    chers        0.87
    chie         0.86
    aceous       0.84
    ters         0.84
    Activation Density: 0.063%

    No Known Activations