INDEX
    Explanations

    phrases related to contrasts or opposition

    phrases related to contrasting or conflicting ideas

    Negative Logits
     exit         -0.65
     flirt        -0.64
     Codec        -0.64
     resort       -0.63
     Burnett      -0.63
     Dickinson    -0.62
     indemn       -0.62
     troop        -0.60
     desp         -0.60
     emoji        -0.60
    Positive Logits
    sized         1.21
    based         1.20
    turned        1.14
    style         1.13
    sama          1.09
    inspired      1.06
    themed        1.06
    driven        1.05
    type          1.05
    worthy        1.05
    Activation Density: 0.175%

    No Known Activations