INDEX
    Explanations

    phrases related to comparison or contrast

    words indicating importance or significance

    Negative Logits
    "but"             -0.66
    "orie"            -0.65
    "WT"              -0.63
    "soDeliveryDate"  -0.62
    "ructose"         -0.61
    "schild"          -0.61
    "iddle"           -0.61
    " Travels"        -0.60
    "ukong"           -0.60
    " But"            -0.59
    Positive Logits
    " nonetheless"    1.46
    " nevertheless"   1.18
    "etheless"        1.07
    " still"          0.77
    " darn"           0.73
    " anyway"         0.71
    " awfully"        0.69
    " strangely"      0.68
    " overshadowed"   0.67
    " anyways"        0.66
    Activation Density: 1.312%

    No Known Activations