    Explanations

    phrases that contrast two different options

    phrases indicating contrast or comparison

    Negative Logits
    "enegger"   -0.73
    "inho"      -0.71
    "ells"      -0.67
    "boys"      -0.66
    " Chips"    -0.64
    " Loaded"   -0.63
    "avan"      -0.63
    "liam"      -0.63
    "obyl"      -0.61
    " Bras"     -0.61
    Positive Logits
    "itably"        0.87
    " necessarily"  0.75
    "isons"         0.69
    " opposed"      0.69
    "────"          0.69
    "untarily"      0.68
    " materially"   0.68
    "entimes"       0.68
    "viously"       0.67
    " willingly"    0.67
    Activation Density 0.010%

    No Known Activations