INDEX
    Explanations

    phrases expressing value or importance

    evaluative statements about worth or significance

    Negative Logits
    Token             Logit
    "but"             -0.87
    " But"            -0.72
    "But"             -0.67
    "eatured"         -0.66
    "hari"            -0.62
    "ructose"         -0.62
    "BUT"             -0.61
    "schild"          -0.60
    "ornia"           -0.60
    " but"            -0.59

    Positive Logits
    Token             Logit
    " nonetheless"     1.85
    " anyway"          1.14
    " nevertheless"    1.10
    " anyways"         1.05
    "etheless"         1.05
    " owing"           0.87
    "."                0.82
    " insofar"         0.82
    " because"         0.81
    ".["               0.77
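
    The two lists above are the ten lowest and ten highest token logit values recorded for this feature. As a minimal sketch of how such lists could be produced from a token-to-value mapping, the Python below ranks the values copied from this page. The function name `top_logits` and the dictionary layout are hypothetical and chosen for illustration only; the page does not say how the values themselves were computed, and the sketch makes no claim about that.

```python
# Token -> logit value, copied from the two lists above.
# A leading space is part of the token text, so " but" and "but" are distinct.
logit_effects = {
    "but": -0.87, " But": -0.72, "But": -0.67, "eatured": -0.66,
    "hari": -0.62, "ructose": -0.62, "BUT": -0.61, "schild": -0.60,
    "ornia": -0.60, " but": -0.59,
    " nonetheless": 1.85, " anyway": 1.14, " nevertheless": 1.10,
    " anyways": 1.05, "etheless": 1.05, " owing": 0.87, ".": 0.82,
    " insofar": 0.82, " because": 0.81, ".[": 0.77,
}


def top_logits(effects, k=10):
    """Return (most negative k, most positive k) as lists of (token, value).

    Hypothetical helper: sorts by value only. Python's sort is stable,
    so tied values keep their original insertion order.
    """
    items = list(effects.items())
    negative = sorted(items, key=lambda kv: kv[1])[:k]
    positive = sorted(items, key=lambda kv: -kv[1])[:k]
    return negative, positive


negative, positive = top_logits(logit_effects, k=3)
for token, value in negative + positive:
    # repr() keeps leading spaces in the token visible.
    print(f"{token!r:>16} {value:+.2f}")
```

    Printing tokens with `repr()` is a deliberate choice here: several entries differ only by a leading space or by capitalization, and plain printing would hide that.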
    Act Density 0.951%

    No Known Activations